JOURNAL DE THÉORIE DES NOMBRES DE BORDEAUX
MICHAEL NAKAMAYE
Diophantine approximation on algebraic varieties
Journal de Théorie des Nombres de Bordeaux, tome 11, no 2 (1999), p. 439-502
<http://www.numdam.org/item?id=JTNB_1999__11_2_439_0>
© Université Bordeaux 1, 1999, all rights reserved.
Access to the archives of the journal « Journal de Théorie des Nombres de Bordeaux » (http://jtnb.cedram.org/) implies agreement with the general conditions of use (http://www.numdam.org/conditions). Any commercial use or systematic printing constitutes a criminal offence. Any copy or printout of this file must contain the present copyright notice.
Article digitized as part of the programme Numérisation de documents anciens mathématiques
Diophantine Approximation on Algebraic Varieties
by MICHAEL NAKAMAYE
RÉSUMÉ. We give an overview of recent progress in the theory of diophantine approximation. Taking Roth's theorem as the starting point, we first consider the Mordell conjecture, and then analogous results in higher dimension, due to Faltings-Wüstholz and to Faltings.
ABSTRACT. We present an overview of recent advances in diophantine approximation. Beginning with Roth's theorem, we discuss the Mordell conjecture and then pass on to recent higher dimensional results due to Faltings-Wüstholz and to Faltings respectively.
0. INTRODUCTION
The theory of diophantine approximation often begins with a finite collection of polynomials $f_1(X), \ldots, f_n(X) \in \mathbb{Q}[X_1, \ldots, X_m]$ with rational coefficients. There are then two distinct types of questions commonly asked. First, one can look for rational points $(p_1/q_1, \ldots, p_m/q_m)$ which lie on the common set of zeroes of the polynomials, i.e. $f_i(p_1/q_1, \ldots, p_m/q_m) = 0$ for all $i$. This line of inquiry leads to the Mordell conjecture (which deals with the case of a single polynomial in two variables) and higher dimensional generalizations due to Faltings and Vojta. Alternatively, one can look for rational points $(p_1/q_1, \ldots, p_m/q_m)$ which are 'approximate solutions' or in other words lie very close to the set of common zeroes of the $f_i$'s. This question is perhaps older than the first and is aptly called diophantine approximation because one is looking not for solutions to diophantine equations but for approximate solutions. In this direction, the most striking results are Roth's theorem, dealing with the case of a single irreducible polynomial in one variable, and the Schmidt subspace theorem which allows for several variables but deals with very specific (linear) polynomials.
One of the simplest results in diophantine approximation is Liouville's theorem. For this, we consider a single irreducible polynomial $f(X) \in \mathbb{Q}[X]$ of degree $d \ge 2$ with a root $\alpha \in \mathbb{R}$. Since $f$ is irreducible, $\alpha \notin \mathbb{Q}$ and it makes sense to approximate this zero of $f(X)$ by rational numbers. Liouville's theorem states that for any rational $p/q$ one always has an inequality of the form

$$\left| \frac{p}{q} - \alpha \right| \ge \frac{c(\alpha)}{q^{d}} \tag{0.1}$$

where $c(\alpha)$ is an explicitly determined constant depending only on $\alpha$. It turns out, however, that the exponent $d$ in the denominator is not best possible except for quadratic $f(X)$. In fact, Roth proved that 0.1 can be improved by replacing $d$ with $2 + \varepsilon$ for any $\varepsilon > 0$. The disadvantage to Roth's theorem is that the constant $c(\alpha)$ is no longer explicitly determined.

Roth's theorem was generalized fifteen years later by Schmidt who deals with the case of $n$ independent linear forms $L_i(X) \in k[X_1, \ldots, X_n]$ where $\mathbb{Q}$ has now been replaced by a finite extension $k/\mathbb{Q}$. The Schmidt subspace theorem (see §5 for an explicit statement) states that the set of rational points $(p_1/q_1, \ldots, p_n/q_n)$ which are close to the $n$ linear subspaces $L_i(X) = 0$ is degenerate; here degenerate means contained in a finite union of proper linear subspaces. In the one variable case, choosing $L(X) = X - \alpha$, one recovers Roth's theorem since a degenerate linear subspace of $\mathbb{Q}$ is just a point. Schmidt's subspace theorem has recently been extended by Faltings and Wüstholz to deal with the case when the $L_i$ are no longer linear.
Returning to the first question posed in diophantine approximation we take the subvariety $X \subset \mathbb{P}^n$ defined over $\mathbb{Q}$ as the common zero locus of our collection of polynomials. As noted above, in this case one is not 'approximating' a geometric object with rational points but rather looking for actual rational solutions to the system of equations defining $X$. The connection between the problems of finding actual, as opposed to approximate, solutions to polynomial equations was developed by Vojta in his proof of the Mordell conjecture [V2]. The Mordell conjecture states that if $X$ is a smooth projective curve of genus at least 2, defined over a number field $k$, then $X$ has only finitely many rational points. Like Roth's theorem, the result is ineffective in that it does not bound the size of the solutions. Building upon Vojta's ideas, Faltings [F1, F2] was able to obtain similar results for higher dimensional varieties under certain restrictions.

The goal of these notes is to give an introduction to the ideas and arguments occurring in diophantine approximation, beginning with the simplest result, Liouville's theorem, and culminating with the difficult results of Faltings and Vojta on the one hand and Faltings-Wüstholz on the other. The rough structure of these notes is as follows: the first section begins with Liouville's theorem and proceeds to analyse the difficulties encountered when trying to sharpen Liouville's result to obtain Roth's theorem. The second section then deals with these difficulties, concentrating on the geometry behind Roth's theorem, specifically the product theorem and Dyson's lemma. The third lecture moves from Roth's theorem to Vojta's proof of the Mordell conjecture, developing the necessary theory of height functions and the corresponding metrics on line bundles along the way. We will once more emphasize the geometric aspects of the proof and give a rough idea of the difficult arithmetic machinery involved. Lectures four and five will be devoted to higher dimensional problems, one to higher dimensional Mordell type theorems and one to the Schmidt subspace theorem respectively. The proofs presented here will not, in general, be complete but we hope to emphasize all of the important points so that the interested reader can then consult the literature for rigorous proofs, and hopefully the account here will facilitate this process.

There are several expositions of the material covered in these lectures. For Roth's theorem and the Schmidt subspace theorem one can consult Schmidt's two collections of lecture notes [S1, S2]. For Dyson's lemma, in addition to Dyson's original paper [D], one can consult [B1, EV, N1, N3]. Vojta has two proofs of the Mordell conjecture, [V2, V3], the second of which is closer to the exposition given here. He also has a beautiful paper [V4] giving a full proof of the higher dimensional results of Faltings [F1, F2]. For surveys of the material behind Faltings' work, one can consult [H] for a beautiful overview or [EE] for a more thorough treatment.

These notes form the basis of a series of five lectures delivered at the Isaac Newton Institute from March 23 through March 27, 1998. It is a pleasure to thank the organizers of the workshop, Christophe Soulé, Jean-Louis Colliot-Thélène, and Jan Nekovář, for inviting me to present this material. I would also like to thank all of those who attended the lectures and whose comments have helped to remove many errors from these notes as well as to add better and more complete explanations. I also thank those who attended courses I gave on part of this material at Harvard in the spring of 1995 and at Bayreuth in the winter of 1996: it is only after trying to teach the material several times that I have reached a full appreciation of the subtleties involved. The spirit of these notes is hopefully the same as that of the lectures, namely informal but aiming to give a reasonably complete picture of the difficult techniques involved in these questions of arithmetic geometry.
1. FROM LIOUVILLE TO ROTH
We begin by recalling the quick and elegant proof of Liouville's theorem.

Theorem 1.1 (Liouville). Suppose $\alpha \in \mathbb{R}$ is an algebraic irrational of degree $d$. Then there is an explicit constant $c(\alpha) > 0$ such that for all $p/q \in \mathbb{Q}$

$$\left| \frac{p}{q} - \alpha \right| \ge \frac{c(\alpha)}{q^{d}}.$$

The proof can be formally divided into three stages which will reappear in all of our arguments but which are particularly transparent in this case. We begin with the irreducible polynomial $f(X) \in \mathbb{Q}[X]$ for $\alpha$ over $\mathbb{Q}$. To specify $f$ uniquely one can take $f \in \mathbb{Z}[X]$ with relatively prime coefficients. The outline of the argument is as follows:

Step 1: $f(p/q) \ne 0$ for any $p/q \in \mathbb{Q}$ since otherwise $f$ would not be irreducible over $\mathbb{Q}$.
Step 2: $|f(p/q)| \ge 1/q^{d}$ since $q^{d}$ is a common denominator for the terms in $f(p/q)$.
Step 3: $|f(p/q)| \le b(\alpha)\,|p/q - \alpha|$ for an explicit constant $b(\alpha)$.

Liouville's theorem follows, with $c(\alpha) = 1/(2b(\alpha))$, by comparing the bounds in Steps 2 and 3. Only the upper bound in Step 3 requires further comment. Suppose we take the Taylor series expansion of $f(X)$ about $\alpha$. Since $f(\alpha) = 0$ the first term is zero:

$$f(X) = \sum_{i=1}^{d} a_i (X - \alpha)^i.$$

Thus

$$\left| f\!\left(\frac{p}{q}\right) \right| \le \sum_{i=1}^{d} |a_i| \left| \frac{p}{q} - \alpha \right|^{i}.$$

This establishes Step 3, provided $|p/q - \alpha| \le 1$, with $b(\alpha) = \sum_{i=1}^{d} |a_i|$, and proves Liouville's theorem (taking $c(\alpha)$ as above). As for effectivity, the numbers $a_i$ depend only on $\alpha$ as they are the coefficients of the Taylor series expansion of $f(X)$ about $\alpha$. Consequently, $b(\alpha)$ and $c(\alpha)$ depend only on $\alpha$.

Suppose now that one tries to improve the exponent $d$ in Liouville's theorem. One idea would be to run through the same argument with a different auxiliary polynomial $f(X)$. Of course this risks losing effectivity because $f(X)$ was determined uniquely by $\alpha$. Nonetheless, one could try. A review of the argument reveals that the exponent $d$ in the theorem comes by taking the quotient of $\deg f(X)$ by $\mathrm{mult}_\alpha(f(X))$. Thus if we replace $f(X)$ with a different polynomial $g(X)$ which also vanishes at $\alpha$ we will get the exponent $\deg g(X)/\mathrm{mult}_\alpha(g(X))$ in place of $d$. But for Step 2 to work, we need to have $g(X) \in \mathbb{Q}[X]$ and thus $g(X)$ will be divisible by $f(X)^{\mathrm{mult}_\alpha(g(X))}$. Thus we always will have $\deg g(X)/\mathrm{mult}_\alpha(g(X)) \ge d$ and there is no improvement.
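The three steps of Liouville's argument can be checked numerically in a small case. The sketch below is illustrative only: it assumes $\alpha = \sqrt{2}$ with $f(X) = X^2 - 2$ (so $d = 2$), and it replaces $b(\alpha) = 2\sqrt{2} + 1$ by the rational upper bound 4, which only weakens the constant to $c = 1/8$. The function names are hypothetical, chosen for this example.

```python
from fractions import Fraction
from math import isqrt

# Assumed example: alpha = sqrt(2), f(X) = X^2 - 2, degree d = 2.
D = 2
# Taylor expansion about alpha: f(X) = 2*sqrt(2)*(X - alpha) + (X - alpha)^2,
# so b(alpha) = 2*sqrt(2) + 1 <= 4. Using the upper bound 4 keeps the
# inequality valid and gives c = 1 / (2 * 4).
C = Fraction(1, 8)


def f(x: Fraction) -> Fraction:
    """The auxiliary polynomial of the example."""
    return x * x - 2


def at_least_t_from_sqrt2(x: Fraction, t: Fraction) -> bool:
    """Exact test of |x - sqrt(2)| >= t for rational x >= 0 and t > 0."""
    right = x - t > 0 and (x - t) ** 2 >= 2   # x - t >= sqrt(2)
    left = (x + t) ** 2 <= 2                  # x + t <= sqrt(2)
    return right or left


def check_liouville(q_max: int) -> bool:
    """Check Steps 1-2 and the final bound for the best numerators p."""
    for q in range(1, q_max + 1):
        r = isqrt(2 * q * q)                  # floor(q * sqrt(2))
        for p in (r, r + 1):                  # the two nearest numerators
            x = Fraction(p, q)
            value = f(x)
            if value == 0:                    # Step 1: f(p/q) != 0
                return False
            if abs(value) < Fraction(1, q ** D):   # Step 2
                return False
            if not at_least_t_from_sqrt2(x, C / q ** D):  # conclusion
                return False
    return True


if __name__ == "__main__":
    print(check_liouville(2000))
```

Only the two integers nearest to $q\sqrt{2}$ are tested for each $q$, since every other numerator gives a worse approximation.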
Thus some new idea is needed in order to improve the exponent $d$ of Liouville's theorem. At the beginning of the century, Thue had the idea of running the same argument with an auxiliary polynomial in two variables. Thus in order to bound how close a single rational number can be to $\alpha$, Thue considers a pair of rational numbers both of which are close to $\alpha$. The reason why an extra approximating rational point helps is that now the auxiliary polynomial will be in two variables and this gives additional degrees of freedom allowing for an improvement in the important ratio $\deg f/\mathrm{mult}_\alpha(f)$ which occurs as the exponent in Liouville's theorem. There are of course immediate complications in the argument, particularly with Step 1 as with two variables there will not be a well-defined choice of auxiliary polynomial with big multiplicity at $(\alpha, \alpha)$. But without a canonical choice of $f(X, Y)$ there is no longer any reason why $f(p_1/q_1, p_2/q_2)$ should be non-zero, without which the argument runs aground. Thus the simplest step in the proof of Liouville's theorem, Step 1, becomes the most difficult when one considers an auxiliary function in more than one variable.

Of course, once one is willing to consider $f(X, Y)$ there is no reason to stop there and Roth was finally able to obtain the best result possible by considering an auxiliary polynomial with an arbitrary number of variables:

Theorem 1.2 (Roth's Theorem). Suppose $\alpha \in \mathbb{R}$ is algebraic and irrational. Then for any $\varepsilon > 0$ there are only finitely many solutions to

$$\left| \frac{p}{q} - \alpha \right| < \frac{1}{q^{2+\varepsilon}}. \tag{1.3}$$

Note that the formulation of Roth's theorem is slightly different from that of Liouville's theorem. It follows from the fact that 1.3 has only finitely many solutions that there exists a constant $c(\alpha)$ such that

$$\left| \frac{p}{q} - \alpha \right| \ge \frac{c(\alpha)}{q^{2+\varepsilon}} \quad \text{for all } p/q \in \mathbb{Q}.$$

The problem is that the constant $c(\alpha)$ is not effective and this is why Roth's theorem is formulated in this alternative style.

The strategy for proving Roth's theorem is identical to that of Liouville's theorem except that we will use several good approximating points. So we assume that Roth's theorem is false and this gives a sequence of good approximating points

$$\frac{p_i}{q_i}, \qquad \left| \frac{p_i}{q_i} - \alpha \right| < \frac{1}{q_i^{2+\varepsilon}}. \tag{1.4}$$

The strategy for the proof will be to choose $m$ of these good approximating points, where $m \to \infty$ as $\varepsilon \to 0$, and construct an auxiliary polynomial in $m$ variables with large order of vanishing at $(\alpha, \ldots, \alpha)$. Steps 2 and 3 of Liouville's theorem will then tell us that the approximations are too good, giving a contradiction. In order for Steps 2 and 3 to yield a contradiction, one needs to choose the appropriate notion for the order of vanishing of our polynomial $f(X_1, \ldots, X_m)$ at $(\alpha, \ldots, \alpha)$. We will see as we go through the proof that the following, though at first awkward, is the apt definition:

Definition 1.5. Let $0 \ne f(X_1, \ldots, X_m) \in \mathbb{C}[X_1, \ldots, X_m]$ and let $(\alpha_1, \ldots, \alpha_m) \in \mathbb{C}^m$. Consider the Taylor series expansion of $f$ about $(\alpha_1, \ldots, \alpha_m)$:

$$f(X_1, \ldots, X_m) = \sum_{J = (j_1, \ldots, j_m)} a_J \,(X_1 - \alpha_1)^{j_1} \cdots (X_m - \alpha_m)^{j_m}.$$

Suppose $f$ has multi-degree $(d_1, \ldots, d_m)$. The index of $f$ at $(\alpha_1, \ldots, \alpha_m)$ is defined as follows:

$$\mathrm{ind}_{(\alpha_1, \ldots, \alpha_m)}(f) = \min \left\{ \sum_{i=1}^{m} \frac{j_i}{d_i} \;:\; a_J \ne 0 \right\}.$$

The index of $f$ at a point $x \in \mathbb{C}^m$ is a weighted multiplicity. For example, if $d_1 = \cdots = d_m = d$ then $\mathrm{ind}_x(f) = \mathrm{mult}_x(f)/d$. In general, each variable is weighted so that it can contribute a maximum of 1 to the index.
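As a concrete check of the notion of index, here is a minimal sketch that computes it for small polynomials with rational coefficients. It assumes the definition in the form $\min\{\sum_i j_i/d_i : a_J \ne 0\}$ over the Taylor coefficients $a_J$ at the point, and it represents a polynomial as a dictionary from exponent tuples to coefficients; that representation and the function names are choices made for this example only.

```python
from fractions import Fraction
from itertools import product
from math import comb


def taylor_coefficients(poly, point):
    """Taylor coefficients a_J of poly about `point`.

    poly maps exponent tuples (e_1, ..., e_m) to coefficients.
    Each monomial is expanded with the binomial theorem after
    substituting X_i = (X_i - a_i) + a_i.
    """
    coeffs = {}
    for exps, c in poly.items():
        for js in product(*(range(e + 1) for e in exps)):
            term = Fraction(c)
            for e, j, a in zip(exps, js, point):
                term *= comb(e, j) * Fraction(a) ** (e - j)
            coeffs[js] = coeffs.get(js, Fraction(0)) + term
    return {js: c for js, c in coeffs.items() if c != 0}


def multi_degree(poly):
    """Degree of poly in each variable separately."""
    m = len(next(iter(poly)))
    return tuple(max(exps[i] for exps in poly) for i in range(m))


def index(poly, point):
    """Weighted order of vanishing: min over a_J != 0 of sum_i j_i / d_i.

    Assumes poly is non-zero and has positive degree in every variable.
    """
    degs = multi_degree(poly)
    taylor = taylor_coefficients(poly, point)
    return min(sum(Fraction(j, d) for j, d in zip(js, degs)) for js in taylor)


# f(X, Y) = (X - 1)^2 (Y - 1), of multi-degree (2, 1), written out in monomials.
F_EXAMPLE = {(2, 1): 1, (2, 0): -1, (1, 1): -2, (1, 0): 2, (0, 1): 1, (0, 0): -1}

if __name__ == "__main__":
    print(index(F_EXAMPLE, (1, 1)), index(F_EXAMPLE, (1, 0)), index(F_EXAMPLE, (0, 0)))
```

In the example each variable contributes its full weight 1 at the point $(1, 1)$, so the index there is 2, the largest value possible for two variables.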
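To see the relevance of the notion of index, we run through the three step proof of Roth's theorem which we would like to model after Liouville's theorem: we construct an auxiliary polynomial $f(X_1, \ldots, X_m)$ of multi-degree $(d_1, \ldots, d_m)$ with large index at the point $(\alpha, \ldots, \alpha)$. Then the proof should proceed as follows:

Step 1: Show that $f(p_1/q_1, \ldots, p_m/q_m) \ne 0$.
Step 2: $|f(p_1/q_1, \ldots, p_m/q_m)| \ge 1/(q_1^{d_1} \cdots q_m^{d_m})$ as in Liouville's theorem.
Step 3: $|f(p_1/q_1, \ldots, p_m/q_m)|$ is small since $|p_i/q_i - \alpha|$ is small and $f$ has large index at $(\alpha, \ldots, \alpha)$.

As it stands, much work is needed in order to make this into a proof. Only Step 2 requires no further justification. We begin by explaining the upper bound for $|f(p_1/q_1, \ldots, p_m/q_m)|$ in Step 3 as this is what motivates the notion of index. As we did in Liouville's theorem, consider the Taylor series expansion of $f$ about $(\alpha, \ldots, \alpha)$:

$$f(X_1, \ldots, X_m) = \sum_{J} a_J \,(X_1 - \alpha)^{j_1} \cdots (X_m - \alpha)^{j_m}. \tag{1.6}$$

Using 1.4 will give the following upper bound

$$\left| f\!\left(\frac{p_1}{q_1}, \ldots, \frac{p_m}{q_m}\right) \right| \le C \cdot \max_{a_J \ne 0} \; \prod_{i=1}^{m} q_i^{-(2+\varepsilon) j_i} \tag{1.7}$$

for some constant $C$ which depends on the number of terms in 1.6 and the size of the coefficients of $f$. We see, ignoring the constant $C$ for the moment, that 1.7 contradicts the lower bound of Step 2 provided

$$a_J = 0 \quad \text{whenever} \quad \prod_{i=1}^{m} q_i^{(2+\varepsilon) j_i} < \prod_{i=1}^{m} q_i^{d_i}. \tag{1.8}$$

Since we do not know the relative sizes of the $q_i$, the only reasonable way to guarantee inequality 1.8 is by choosing $d_i$ so that the $q_i^{d_i}$ are all roughly proportional, i.e. we need to choose

$$q_1^{d_1} \approx q_2^{d_2} \approx \cdots \approx q_m^{d_m}. \tag{1.9}$$

Moreover, with this choice of $d_i$ inequality 1.8 translates into the following: we want

$$a_J = 0 \quad \text{whenever} \quad \sum_{i=1}^{m} \frac{j_i}{d_i} < \frac{m}{2+\varepsilon}. \tag{1.10}$$

And here the index has finally reappeared as a natural consequence of the argument: indeed, 1.10 says that we want $\mathrm{ind}_{(\alpha, \ldots, \alpha)}(f) \ge m/(2+\varepsilon)$.

To summarize, the lower and upper bounds for $|f(p_1/q_1, \ldots, p_m/q_m)|$ given in Steps 2 and 3 respectively contradict one another, assuming we can suitably bound the constant $C$ in 1.7, provided that $\mathrm{ind}_{(\alpha, \ldots, \alpha)}(f) \ge m/(2+\varepsilon)$. At this point there are three technical issues to deal with in order to make the argument rigorous:

Problem 1: Show that $f(p_1/q_1, \ldots, p_m/q_m) \ne 0$.
Problem 2: Bound the constant $C$ occurring in 1.7.
Problem 3: Show that one can find $0 \ne f$ with index $\ge m/(2+\varepsilon)$ at $(\alpha, \ldots, \alpha)$.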
We will deal with the three problems in reverse order. Problem 3 is essentially a counting problem. We want to kill leading terms $a_J$ in the Taylor series expansion 1.6 provided $J$ satisfies 1.10. In order to guarantee that one can force all of these $a_J$ to vanish, it suffices to show that the number of $m$-tuples $(j_1, \ldots, j_m)$ with $0 \le j_i \le d_i$ satisfying 1.10 is a small fraction of the total number of monomials; for if this is true then a dimension count shows that there are lots of polynomials, with coefficients in a finite extension of $\mathbb{Q}$, having index at least $m/(2+\varepsilon)$ at $(\alpha, \ldots, \alpha)$ and all of its Galois conjugates. Taking the norm over $\mathbb{Q}$ of such a polynomial and clearing denominators gives the desired auxiliary polynomial. To show that all of this works in our particular setting is a special case of what Faltings and Wüstholz [FW2] refer to as the 'law of large numbers'. Roughly speaking, this says the following: for each $i$, the "average" value of $j_i/d_i$ is $1/2$ so if one randomly chooses an $m$-tuple $(j_1, \ldots, j_m)$ the probability that it satisfies the inequality in 1.10 approaches zero as $m$ approaches infinity.
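The counting behind this 'law of large numbers' can be made explicit in a simplified case. The sketch below assumes equal degrees $d_1 = \cdots = d_m = d$ (a simplification not made in the text) and counts exactly, by dynamic programming, the proportion of exponent tuples with $\sum_i j_i/d < m/(2+\varepsilon)$. The function name and the sample values $d = 10$, $\varepsilon = 1$ are illustrative choices.

```python
from fractions import Fraction


def low_index_fraction(m: int, d: int, eps) -> Fraction:
    """Proportion of tuples (j_1, ..., j_m), 0 <= j_i <= d, with
    (j_1 + ... + j_m) / d < m / (2 + eps).

    These are the Taylor coefficients that must vanish in order to
    have index >= m / (2 + eps), assuming all degrees equal d.
    """
    eps = Fraction(eps)
    counts = [1]                      # counts[s] = number of tuples with sum s
    for _ in range(m):
        new = [0] * (len(counts) + d)
        for s, c in enumerate(counts):
            for j in range(d + 1):
                new[s + j] += c
        counts = new
    threshold = Fraction(m * d) / (2 + eps)
    bad = sum(c for s, c in enumerate(counts) if s < threshold)
    return Fraction(bad, (d + 1) ** m)


if __name__ == "__main__":
    for m in (2, 8, 32):
        print(m, float(low_index_fraction(m, 10, 1)))
```

For fixed $\varepsilon$ the proportion shrinks as $m$ grows, which is what leaves room for a non-zero auxiliary polynomial once the conditions at all conjugates of $\alpha$ are counted.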
For a rigorous statement and proof one can consult [FW2] Proposition 5.1. Alternatively, for the case in which we are interested, one can make an explicit computation as in [L1] pp. 170-171.

Next we deal with Problem 2. There are two separate issues here, first counting the number of terms $a_J$ and second bounding the size of the $|a_J|$. The first of these is simple as the number of $a_J$ is at most $\prod_{i=1}^{m} (d_i + 1)$. Bounding the size of the $a_J$ is equivalent to bounding the size of the coefficients of the auxiliary polynomial $f(X)$ which is potentially a serious matter as $f$ was constructed abstractly to satisfy certain vanishing conditions. The fact that the size of the coefficients of $f$ can be controlled is a consequence of the famous Siegel lemma:

Lemma 1.11 (Siegel's Lemma). Consider a system of $m$ linear equations in $n$ unknowns, with $m < n$:

$$\sum_{j=1}^{n} a_{ij} x_j = 0, \qquad 1 \le i \le m.$$

Suppose $a_{ij} \in \mathbb{Z}$ for all $i, j$ and that $|a_{ij}| \le A$ for all $i, j$. Then there exists a non-zero solution $(x_1, \ldots, x_n) \in \mathbb{Z}^n$ to the system of equations with

$$|x_i| \le (nA)^{m/(n-m)}.$$
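Siegel's lemma can be tried out on a toy system. The following sketch assumes the bound in the standard form $|x_i| \le (nA)^{m/(n-m)}$ and simply searches the corresponding box by brute force; it illustrates the statement only, not the pigeonhole proof and not the way the lemma is applied to the auxiliary polynomial. The function names and the sample system are hypothetical.

```python
from itertools import product


def siegel_box(rows) -> int:
    """Largest integer B with B <= (n*A)^(m/(n-m)), computed exactly."""
    m, n = len(rows), len(rows[0])
    if not m < n:
        raise ValueError("need fewer equations than unknowns")
    a_max = max(abs(a) for row in rows for a in row)
    target = (n * a_max) ** m
    bound = 0
    while (bound + 1) ** (n - m) <= target:
        bound += 1
    return bound


def small_solution(rows):
    """First non-zero integer solution inside the Siegel box, or None."""
    n = len(rows[0])
    bound = siegel_box(rows)
    for xs in product(range(-bound, bound + 1), repeat=n):
        if any(xs) and all(sum(a * x for a, x in zip(row, xs)) == 0 for row in rows):
            return xs
    return None


# Two equations in four unknowns with |a_ij| <= 1: here (n*A)^(m/(n-m)) = 4.
SYSTEM = [[1, 1, -1, 0],
          [0, 1, 1, -1]]

if __name__ == "__main__":
    print(siegel_box(SYSTEM), small_solution(SYSTEM))
```

The search is exponential in $n$ and is only feasible for tiny systems; the point of the lemma is that a solution in this box exists without any search.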
We apply Siegel's lemma by viewing the condition $\mathrm{ind}_{(\alpha, \ldots, \alpha)}(f) \ge m/(2+\varepsilon)$ as a system of linear equations in the coefficients of $f$, i.e. we view the coefficients of $f$ as variables. Siegel's lemma cannot be applied directly in this situation because our system of linear equations will have coefficients in the number field $K = \mathbb{Q}(\alpha)$ since we are trying to impose large index at $(\alpha, \ldots, \alpha)$. Choosing a basis for $K$ as a vector space over $\mathbb{Q}$ allows one to make the reduction to Lemma 1.11 (for details, see [S2] Lemma 9A). One then obtains a polynomial $g$ in $m$ variables, with coefficients in $K$ and with large index along $(\alpha, \ldots, \alpha)$ and its conjugates, such that the coefficients have bounded size; taking the norm of $g$ then gives the desired $f$.

The crucial observation to make about Siegel's lemma is that the exponent $m/(n - m)$ is small provided that the number of unknowns is a fixed multiple of the number of equations. At this point, one needs to do a lot of accounting to check that Siegel's lemma gives an auxiliary polynomial $f \in \mathbb{Z}[X_1, \ldots, X_m]$ such that the constant $C$ in 1.7 is small enough to obtain a contradiction when comparing the lower and upper bounds on $|f(p_1/q_1, \ldots, p_m/q_m)|$. In practice, to get the numbers to work out one chooses $m$ big enough to impose index $m/(2 + \varepsilon/2)$ at $(\alpha, \ldots, \alpha)$ and the extra $\varepsilon/2$ allows one to absorb the bound for $C$ in 1.7 and still obtain a contradiction when comparing the numbers in Steps 2 and 3.

2. INTERLUDE: DYSON'S LEMMA
We still have not faced the most difficult of the obstructions to proving Roth's theorem, namely the issue arising at the beginning of the argument in Step 1: how can we guarantee that $f(p_1/q_1, \ldots, p_m/q_m) \ne 0$? Without this information, all of our computations to obtain a contradiction from comparing upper and lower bounds of $|f(p_1/q_1, \ldots, p_m/q_m)|$ are in vain. This is by far the most difficult part of Roth's theorem as the rest of the argument essentially consists in counting. In general, there is of course no way to guarantee that $f(p_1/q_1, \ldots, p_m/q_m) \ne 0$ because $f$ has not been explicitly constructed: we simply know from dimension counting that such an $f$ exists but of course there may be several and most of them will vanish at the approximating point $(p_1/q_1, \ldots, p_m/q_m)$.

Roth originally dealt with this problem in an essentially arithmetic manner. He proved, in what is now called Roth's lemma, that provided $d_1 \gg d_2 \gg \cdots \gg d_m$ (or, equivalently, given 1.9, $q_1 \ll q_2 \ll \cdots \ll q_m$) the polynomial $f(X)$ constructed using Siegel's lemma cannot have large order of vanishing at $(p_1/q_1, \ldots, p_m/q_m)$. Taking the appropriate derivatives of $f$ then yields a contradiction (since the number of derivatives is small, it does not affect the contradiction arrived at in Steps 2 and 3 of the argument). Roth's lemma is essentially arithmetic in nature, using the facts that $f$ has integer coefficients and that we are interested in its order of vanishing at a rational point. For a proof, one can consult [L1] pp. 179-181.

An alternative geometric argument to establish non-vanishing of $f(p_1/q_1, \ldots, p_m/q_m)$ was developed thirty years later by Esnault and Viehweg [EV] building upon previous work of Dyson [D], Bombieri [B1], and Viola [Vi]. Esnault and Viehweg approach the problem from the following point of view. Suppose that whichever $f(X)$ we choose with large index at $(\alpha, \ldots, \alpha)$ we always find that $f(X)$ also has large index at $(p_1/q_1, \ldots, p_m/q_m)$. This says that certain linear conditions on the space of all polynomials of degree $(d_1, \ldots, d_m)$ fail to be independent. Thus if one could establish this independence then this would also yield a contradiction. The advantage to this viewpoint is that it is entirely geometric; the fact that the points in which we are interested are all algebraic becomes unimportant.
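To state the powerful result which Esnault and Viehweg were able to prove, we introduce some new notation. Let

$$P = \mathbb{P}^1 \times \cdots \times \mathbb{P}^1$$

be a product of $m$ projective lines defined over $k$, an algebraically closed field of characteristic zero. We will assume for simplicity that $k = \mathbb{C}$, the field of complex numbers, although the argument remains valid in the general case. Let $\pi_i : P \to \mathbb{P}^1$ be the projection to the $i$th factor. For positive integers $d_1, \ldots, d_m$ write $d = (d_1, \ldots, d_m)$ and

$$\mathcal{O}_P(d) = \pi_1^* \mathcal{O}_{\mathbb{P}^1}(d_1) \otimes \cdots \otimes \pi_m^* \mathcal{O}_{\mathbb{P}^1}(d_m).$$

For fixed $d$ let $0 \ne s \in H^0(P, \mathcal{O}_P(d))$. The index of $s$ at a closed point $\zeta$ of $P$ is defined by locally identifying $s$ with a polynomial and applying Definition 1.5. We need to define certain volumes as in [EV] Definition 0.2:

Definition 2.1. Let $I^m = \{\xi = (\xi_1, \ldots, \xi_m) \in \mathbb{R}^m : 0 \le \xi_i \le 1 \text{ for all } i\}$ and let $\mathrm{Vol}(t)$ denote the volume of $\{\xi \in I^m : \sum_{i=1}^{m} \xi_i \le t\}$.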
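For small $m$ the volume in Definition 2.1 can be evaluated in closed form. The sketch below assumes that $\mathrm{Vol}(t)$ is the volume of the part of the unit cube $I^m$ where $\xi_1 + \cdots + \xi_m \le t$, and uses the classical inclusion-exclusion formula for the distribution of a sum of $m$ independent uniform variables (the Irwin-Hall distribution); the use of that formula is a choice made for this illustration and is not taken from [EV].

```python
from fractions import Fraction
from math import comb, factorial, floor


def vol(m: int, t) -> Fraction:
    """Volume of {xi in [0,1]^m : xi_1 + ... + xi_m <= t}.

    Uses Vol(t) = (1/m!) * sum_{k=0}^{floor(t)} (-1)^k C(m,k) (t-k)^m
    for 0 < t < m, with Vol(t) = 0 for t <= 0 and Vol(t) = 1 for t >= m.
    """
    t = Fraction(t)
    if t <= 0:
        return Fraction(0)
    if t >= m:
        return Fraction(1)
    total = Fraction(0)
    for k in range(floor(t) + 1):
        total += (-1) ** k * comb(m, k) * (t - k) ** m
    return total / factorial(m)


if __name__ == "__main__":
    # By symmetry half of the cube lies below the hyperplane sum = m/2.
    print(vol(3, Fraction(3, 2)), vol(2, Fraction(1, 2)), vol(4, 1))
```

For $t \le 1$ the region is a simplex and the formula reduces to $t^m/m!$, which shows how quickly the cost of a given index falls as the number of factors grows.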
Of course $1 - \mathrm{Vol}(t)$ measures, asymptotically in $d$, the proportion of sections of $H^0(P, \mathcal{O}_P(d))$ with index $\ge t$ at a point $\zeta$. Dyson's lemma gives conditions on a set of points $\{\zeta_i\} \subset P$ and a line bundle $\mathcal{O}_P(d)$ so that requiring index $t_i$ at $\zeta_i$ imposes almost independent conditions on global sections of $\mathcal{O}_P(d)$. The following is an alternative, slightly more transparent formulation of Esnault and Viehweg's Dyson lemma:

Theorem 2.2. Suppose $0 \ne s \in H^0(P, \mathcal{O}_P(d))$ and $\zeta_1, \ldots, \zeta_M \in P$ so […]

Note that if $t \le 0$ then $\mathrm{Vol}(t) = 0$.

Theorem 2.2 says that the linear conditions for imposing index $t_i - m\delta$ at $\zeta_i$ are independent inside $H^0(P, \mathcal{O}_P(d))$. The key observation about Theorem 2.2 is that if $d_1 \gg d_2 \gg \cdots \gg d_m$ then $m\delta$ is small so in this case Theorem 2.2 says that the conditions to impose index $t_i$ at $\zeta_i$ are nearly independent inside $H^0(P, \mathcal{O}_P(d))$. This will then allow us to derive the desired contradiction and prove Roth's theorem, provided we can arrange for $d_1 \gg d_2 \gg \cdots \gg d_m$. But by 1.9 this is equivalent to choosing rational approximations with denominators satisfying $q_1 \ll q_2 \ll \cdots \ll q_m$ and we can achieve this since we have assumed that there are infinitely many good rational approximating points.

It should be noted that there is still some arguing left because Dyson's lemma only shows that there exists $g(X) \in \mathbb{C}[X_1, \ldots, X_m]$ with large index at $(\alpha, \ldots, \alpha)$ and its conjugates which has small order of vanishing at $(p_1/q_1, \ldots, p_m/q_m)$. From here one can construct $f(X) \in \mathbb{Z}[X_1, \ldots, X_m]$ with large index at $(\alpha, \ldots, \alpha)$ and small index at $(p_1/q_1, \ldots, p_m/q_m)$ but a priori we have no control on the size of the coefficients of $f$. Thus more calculations of dimensions and of Siegel type are necessary (see [EV] §10 for details).
We now turn to the proof of Theorem 2.2. In the statement of Theorem 2.2, the volumes which occur have been perturbed by $m\delta$. The reason for this is that suitable derivatives of the non-zero section $s \in H^0(P, \mathcal{O}_P(d))$ with index $t_i$ at $\zeta_i$ still have index $\ge t_i - m\delta$ at $\zeta_i$. The product theorem will tell us that the common zero locus of these sections is contained in a finite union of proper product subvarieties. Let $P' \subset P$ be one of these product subvarieties and let $\pi : P \to P'$ denote the natural projection. Taking $\zeta_i' = \pi(\zeta_i)$ one then shows, except in certain degenerate cases, that on $P'$ one can produce a new non-zero section $s' \in H^0(P', \mathcal{O}_{P'}(d))$ with suitable indices at the $\zeta_i'$. This procedure is then iterated to produce a zero cycle representing $c_1(\mathcal{O}_P(d))^m$ and the part of the class associated to $\zeta_i$ measures the cost of imposing index $t_i - m\delta$ at $\zeta_i$. These classes have disjoint supports and this implies the desired conclusion: the cost of imposing index $t_i - m\delta$ at $\zeta_i$ for all $i$ is the sum of the individual costs, i.e. the conditions are independent.
The details of the proof of Theorem 2.2 are as follows. We are given a non-zero section $s \in H^0(P, \mathcal{O}_P(d))$ with $t_i = \mathrm{ind}_{\zeta_i}(s)$. Given a point $\zeta \in P$, an affine open subset $U = \mathrm{Spec}\, A \subset P$, and a local trivialization of $\mathcal{O}_P(d)$ on $U$, the sections $\sigma \in H^0(P, \mathcal{O}_P(d))$ such that $\mathrm{ind}_\zeta(\sigma) \ge t$ generate an ideal in $A$. This ideal does not depend on the trivialization and as $U$ varies over an affine cover of $P$ this defines an ideal sheaf $\mathcal{I}_{\zeta, d, t} \subset \mathcal{O}_P$. Let […]

Since we are trying to show that the various index conditions are independent, as a first approximation one should expect the supports of these ideal sheaves to be disjoint. This is not difficult to show because one can write down explicit monomial generators for $\mathcal{I}_{\zeta, d, t}$ and hence determine its support (see [EV] Lemma 2.5). It then follows readily ([EV] Lemma 2.8), as desired, that

$$Z_i \cap Z_j = \emptyset \quad \text{for } i \ne j. \tag{2.3}$$

As discussed above, we would like to be able to produce sections satisfying the relevant index conditions on proper product subvarieties. To this end, let $\mathcal{I}_d(s)$ […] $\mathcal{I}_{\zeta_j, d, t_j}$ […] and for $P' \subset P$ a product subvariety let $\mathcal{O}_{P'}(d) = \mathcal{O}_P(d)|_{P'}$. If one is unlucky and $P' \subset Z_i$ for some $i$ then $H^0($[…]$) = 0$ for all $k > 0$ so there will be no hope of producing a new section. On the other hand, [EV] 2.9(ii) implies that whenever $P' \not\subset Z_i$ for any $i$ then for $k$ sufficiently divisible

[…] (2.4)

There is one more crucial ingredient in the proof of Theorem 2.2 as it allows us to use the sections guaranteed by 2.4 to construct an intersection class representing $c_1(\mathcal{O}_P(d))^m$. This is the so-called product theorem which establishes a certain degeneracy in the locus where the non-zero sections […] can have large index. Initially inspired by the product theorem of Faltings [F1], the result presented here strengthens Theorem 3.1 of [N1].

Theorem 2.5 (Product Theorem). Let $f \in \mathbb{C}[X_1, \ldots, X_m]$ be of multi-degree $d_1, \ldots, d_m$ and for an $(m-1)$-tuple of non-negative integers $a = (a_1, \ldots, a_{m-1})$ […]. Let $W \subset X(f)$ be an irreducible component. Then $W$ is contained in a proper product subvariety.
Proof of Theorem 2.5. We will show that either […] is a point or […] in a proper product subvariety and we are done. In the latter case, choose a general $\lambda \in \mathbb{C}$ and replace $f$ with […]. Since $\partial/\partial X_1, \ldots, \partial/\partial X_{m-1}$ commute with evaluation at $\lambda$ we have

[…] (2.5.1)

But […] $f(X_1, \ldots, X_m)$ vanishes along $W' \times \mathbb{C}$ for $a_i \le$ […] by assumption. Hence by 2.5.1 […]. Thus, using the notation in the statement of Theorem 2.5, $W' \subset$ […] and we can conclude by induction on $m$ that $W'$, and hence $W$, must be contained in a proper product subvariety.

We now proceed to establish that if […] is not a point then $W = W' \times \mathbb{C}$, concluding the proof of Theorem 2.5. Choose a general point $\eta \in W$ so that […]. Let $T_\eta(W)$ denote the tangent space to $W$ at $\eta$. Since we are assuming that $W \to \mathbb{C}$ is surjective it follows, choosing $\eta$ still more general if necessary, that we may assume the induced map on tangent spaces is surjective. Consequently, viewing $T_\eta(W)$, after translation to the origin, as a subspace […] (2.5.3)

We claim that for some $1 \le i \le m - 1$

[…] (2.5.5)

Observe first that if $D^a$ is any differential operator of order $t_\eta(f) - 1$ and if $\partial/\partial X \in T_\eta(W)$ then […] $= 0$ since $D^a(f)$ vanishes on $W$ and $\partial/\partial X \in T_\eta(W)$. Thus […] (2.5.4). Suppose, contrary to 2.5.5, that for […] we have […] (2.5.6). By 2.5.4 and 2.5.6 we also have […]. But this would imply that […] $+ 1$, a contradiction. Thus 2.5.5 holds. Applying the same argument inductively and using 2.5.3 shows that for some non-negative $i_1, \ldots,$ […]. But by the definition of $X(f)$, […] $= 0$ for any $0 \le a_j \le d_j$. Thus […] $+ 1$. Since $f$ has degree $d_j$ in $X_j$ this is only possible if $W$ is of the form $W' \times \mathbb{C}$. This concludes the proof of the product theorem.

Observe that if we
d with Nd for Nsufficiently
divisible then we can assume without loss ofgenerality
that ifZ 0
Zi
forany i
then thereexists
Indeed, applying
theproduct
theorem to the section s of Theorem 2.2gives
sections ofH°
(P,
Op (d)
o whichgenerate
off a finiteunion of proper
product
subvarieties. If P’ is one of these properproduct
subvarieties,
not contained inZi
forany i,
then 2.4gives
a non-zero sectionfor some
positive
integer
n. Since 2.4only
needs to beapplied
finitely
manytimes in order to obtain sections on any
subvariety Z 0 Zi,
we can chooseN
sufficiently
divisible togive
all of the sections sz at once.We will use 2.6 in order to construct an effective
cycle representing
This is done
inductively
as follows. With sp as in2.6,
Z(sp)
represents
cl(Op(d)).
Let Z be an irreduciblecomponent
ofZ(sp).
IfZ C
Zi
for somei,
then choose some0 ~
Sz EH°(Z,Op(d))
andZ(sz)
isan effective
representative
forcl (Op (d)) fl Z.
Suppose
next thatZ 0
Zi
forany i. Then
by
2.6 there exists EH°
and
Z(sz)
represents
cl (Op (d))
fl Z.Repeating
thisprocedure
inductively
gives
an effectiverepresentative
forMoreover,
sinceby
2.3 theZi
aremutually disjoint,
we can writewhere is the
part
of the intersectionsupported
onZ; .
Since R iseffective,
We have
deg
= m!die
Thus to conclude theproof
ofTheorem 2.2 it suffices to prove that for
1 j
M we haveCombining
2.8 with 2.7yields
Theorem 2.2.A
rigorous proof of 2.8 involves blowing up the ideal sheaf […] but intuitively it has an easy explanation. Suppose $C_1, C_2 \subset \mathbb{P}^2$ are two distinct, irreducible curves meeting in a point $P$. Suppose moreover that both $C_1$ and $C_2$ have multiplicity $m$ at $P$. Let $(C_1 \cdot C_2)_P$ denote the part of the intersection class $C_1 \cdot C_2$ supported at $P$. Then one has

$$\deg (C_1 \cdot C_2)_P \ge m^2.$$

But of course $m^2$ measures precisely the cost of imposing multiplicity $m$ at $P$. In the situation of 2.8, things are more complicated because the condition we are imposing will force vanishing not only at points but also along higher dimensional subvarieties. The fundamental principle, however, is the same: the part of the intersection class we have produced supported on $Z_j$ measures by definition the cost of imposing index $t_j - m\delta$ at $\zeta_j$. The right hand side of 2.8 is the cost of imposing $t_j - m\delta$ at $\zeta_j$, whence the inequality.

More precisely, suppose $\pi : Y \to P$ is the blow-up of $P$ along […]. The ideal sheaf has been defined so that it is generated locally by global sections of $H^0(P, \mathcal{O}_P(d))$. Hence if $E$ denotes the exceptional divisor of $\pi$, it follows that $\pi^*\mathcal{O}_P(d)(-E)$ is numerically effective. Lifting the construction of […] to $Y$ one verifies that

[…] (2.9)

Since $\pi^*\mathcal{O}_P(d)(-E)$ is numerically effective […] is determined by the growth of […] for large $n$. Pushing back down to $P$, a direct computation (see [N3]) then derives 2.8 from 2.9.

We now pass to
Vojta's version of Dyson's lemma on products of curves of any genus. This is a simple and beautiful theorem which initially inspired Vojta's diophantine proof of the Mordell conjecture [V2] and serves as a key ingredient in that proof. We begin by fixing some notation. Let $C_1$ and $C_2$ be smooth projective curves over an algebraically closed field of characteristic zero. Suppose $L$ is a line bundle on $C_1 \times C_2$. Let $F_1 = P_1 \times C_2$ be a fibre of the first projection and $F_2 = C_1 \times P_2$ a fibre of the second projection for arbitrary points $P_1 \in C_1$ and $P_2 \in C_2$. Let $d_1 = L \cdot F_2$ and $d_2 = L \cdot F_1$ so $d_1$ and $d_2$ measure the 'degrees' of the line bundle $L$ in analogy to the multi-degree of our auxiliary polynomial in Roth's theorem. […] isomorphic to the projective line. If $s$ is a non-zero section of $L$ it makes sense to talk about the index of $s$ at a point $x \in C_1 \times C_2$: Definition 1.5 applies in this situation, with weights $d_1, d_2$, as it is a local definition. We let the genus of $C_1$ be $g_1$ and the genus of $C_2$ will be denoted $g_2$.

Theorem 2.10 (Vojta). Suppose there is a non-zero section $s \in H^0(C_1 \times C_2, L)$. Suppose moreover that $\zeta_1, \ldots, \zeta_M$ are points in $C_1 \times C_2$ such that no two are contained in a proper product subvariety and let $\mathrm{ind}_{\zeta_i}(s) = t_i$. Write $Z(s) = \sum_{i=1}^{r} a_i Z_i$ where $Z_i$ is irreducible and let […]

[…] (2.11)

Several remarks are in order to explain the strength of Theorem 2.10. If, as will be the case in the Mordell conjecture, the genus of both $C_1$ and $C_2$ is at least 2, then $2g_1 - 2 + M > 0$ and one does not need to take the maximum. In these circumstances, inequality 2.11 reads

[…] (2.12)

Moreover, we know that […] $d_2$ since any curve $Z_i$ which is not a fibre of either projection satisfies […] $\ge 1$ and […] $\ge 1$. If we assume that $d_1 \gg d_2$, as we did in Roth's theorem, this will kill the second term of 2.12 leaving us with […]. In the case we dealt with previously where both $C_1$ and $C_2$ had genus zero, we have $c_1(L)^2 = 2 d_1 d_2$ and this gives the 1 on the right hand side of our previous Dyson lemma. On higher genus curves, however, there are more line bundles because the diagonal inside $C \times C$ is only linearly equivalent to fibres of the two projections when the genus of $C$ is zero. Thus if one can choose a line bundle with $c_1(L)^2 \ll d_1 d_2$ then Vojta's version of Dyson's lemma says that no section of $L$ can have big index at any point of $C \times C$. It is exactly in this fashion that Dyson's lemma will be applied to prove the Mordell conjecture.
The proof of Theorem 2.10 is essentially identical to that of Theorem 2.2, the only difference being that taking derivatives is slightly more complicated on a curve of larger genus: this, in fact, is what creates the $2g_1 - 2$ occurring on the right hand side of 2.11. To see the similarity between Theorem 2.2 and Theorem 2.10 we consider first the special case of Theorem 2.10 where both $C_1$ and $C_2$ are projective lines. In this case, the theorems [...] appears on the left hand side of Theorem 2.2 while it has been moved to the right hand side in Theorem 2.10.

To see how our
proof of Theorem 2.2 can be adapted to this new situation, suppose we make sure, when constructing our intersection product, that the sections which we intersect all have index at least $t_i$ at $\zeta_i$ rather than settling for index $t_i - 2\delta$ as we did. In other words, after identifying $s$ with a polynomial $f(X, Y)$ on an affine open subset and then differentiating $f(X, Y)$, we increase the index by multiplying by some other fixed polynomial. Of course, since the goal of the proof is to induct on the dimension of product subvarieties, the polynomial one multiplies by should if possible have zeroes in a proper product subvariety, i.e. one should just add fibres of one of the two projections. Thus the simplest solution is to replace the derivatives by

[...] (2.13)

Indeed, the polynomials in 2.13 still have index at least $t_i$ at $\zeta_i$, as [...] for all $i$, and the possible decrease in index incurred by differentiating has been remedied. The only problem is that the degree of the polynomial has been increased. To minimize the increase, note that one of the points, say $\zeta_1$, can be taken to be $(\infty, \infty)$ with respect to the affine open patch where we differentiate. Then the derivatives do not affect the index of the projectivization of $f$ at $\zeta_1$. Also, since each derivative decreases the degree of $f$, we have some additional room to twist in order to get a section of [...]. In particular, taking $\zeta_2 = (0, 0)$, we note that $X_1 \frac{\partial}{\partial X_1}(f)$ has degree $(d_1, d_2)$ and index $\ge t_2$ at $\zeta_2$. So with this modification 2.13 gives us polynomials of bi-degree $(d_1 + (M - 2)i, d_2)$. The number of derivatives which need to be taken in order for the common zeroes to be contained in a proper product subvariety is by definition the number $e$, because [...] is transverse to all non-fibral $C$ and so these partial derivatives always decrease the order of vanishing along such components of $Z(s)$. Thus the conclusion, if we run through the argument used to prove Theorem 2.2, is that

[...] (2.14)

Expanding 2.14 yields exactly 2.11 up to a factor of two on the second term: for a discussion of how to remove this factor of 2, see [EV] §10.
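
The claim that $X_1 \frac{\partial}{\partial X_1}$ preserves both the bi-degree and the index at the origin can be seen monomial by monomial. The sketch below assumes that the index of $f$ at $(0,0)$ with weights $d_1, d_2$ is the minimum of $a/d_1 + b/d_2$ over the monomials $X_1^a X_2^b$ occurring in $f$, which is how Definition 1.5 is read here; the coefficients $c_{ab}$ and the second variable name $X_2$ are notation introduced for the illustration.

```latex
% The operator X_1 d/dX_1 rescales each monomial by its X_1-exponent.
\[
  f = \sum_{a \le d_1,\; b \le d_2} c_{ab}\, X_1^{a} X_2^{b}
  \quad\Longrightarrow\quad
  X_1 \frac{\partial f}{\partial X_1}
    = \sum_{a \le d_1,\; b \le d_2} a\, c_{ab}\, X_1^{a} X_2^{b} .
\]
% No new exponents appear, so the bi-degree is still at most (d_1, d_2).
% Monomials with a = 0 are removed; all others survive in characteristic 0.
% The minimum over a smaller set of monomials can only go up
% (index = +infinity by convention if the derivative vanishes identically).
\[
  \min_{a\, c_{ab} \ne 0} \Bigl( \frac{a}{d_1} + \frac{b}{d_2} \Bigr)
  \;\ge\;
  \min_{c_{ab} \ne 0} \Bigl( \frac{a}{d_1} + \frac{b}{d_2} \Bigr)
  \;\ge\; t_2 .
\]
```

This is why placing one point at the origin and one at infinity costs nothing in degree, leaving only the remaining $M - 2$ points to be paid for with additional fibres.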
We now turn to proving Vojta's version of Dyson's lemma in full generality. In order to take derivatives of a section of $L$ on $C_1 \times C_2$ we recall the method used by Faltings ([F1] p. 560 and p. 566) in a more general setting. The problem with taking derivatives of a section $s \in H^0(C_1 \times C_2, L)$ is first that there are no global vector fields on a curve of genus at least 2 and second that after identifying $s$ locally with some polynomial and differentiating this polynomial, the derivatives no longer patch together to give a section of $L$. To remedy this we consider the following construction. Suppose we are given a dominant map $p_1 : C_1 \to \mathbb{P}^1$. Then one can pull back the derivation on $\mathbb{P}^1$ via $p_1$, with poles along the ramification divisor $R$ of $p_1$, and so obtain:

Definition 2.15. Let $s \in H^0(C_1 \times C_2, L)$ and suppose $Z \subset Z(s)$. Then $D(s)$ is the global section of $L(\pi_1^* K_{C_1})|_Z$ obtained locally by differentiating $s$ via the pull-back of the derivation on $\mathbb{P}^1$ by the composition $p_1 \circ \pi_1$. Higher order derivatives are defined analogously, so [...] for $\alpha > 0$ provided all partial derivatives $D^\beta(s)$ with $\beta < \alpha$ vanish identically along $Z$.
Using Definition 2.15, one can derive Theorem 2.10 in exactly the same manner in which we proved Theorem 2.2. The only difference is that when we take $e$ derivatives of the section $s$ we have to twist by $e\,\pi_1^* K_{C_1}$. Since we also have to add one fibre for each of the points $\zeta_i$ in order to preserve the index of the derivative of $s$ at $\zeta_i$, this means that the total twist we need in order to take the necessary $e$ derivatives is [...] and, since $\deg(K_{C_1}) = 2g_1 - 2$, this is exactly what occurs on the right hand side in 2.11. Thus arguing exactly as before one finds, for $J = \max\{2g_1 - 2 + M, 0\}$,

[...]

which is 2.11 up to a factor of 2 on the second term. We already saw this factor of 2 appear after 2.14 and to avoid it one makes a direct intersection theoretic construction as in [V1] or [EV] §10.
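
The bookkeeping behind the total twist can be made explicit. The count below is a sketch of one reading of the argument, assuming that each derivative costs one copy of $\pi_1^* K_{C_1}$ together with one fibre of the first projection for each of the $M$ points $\zeta_i$; it is offered as a consistency check against the genus-zero discussion, not as a reconstruction of the display lost from the text.

```latex
% Degree (measured along the first factor) of the twist for one derivative:
\[
  \deg K_{C_1} + M \;=\; 2g_1 - 2 + M .
\]
% After e derivatives the accumulated twist has degree
\[
  e\,\bigl( 2g_1 - 2 + M \bigr) .
\]
% Consistency check with the case C_1 = P^1 (g_1 = 0):
% the twist per derivative is M - 2, matching the bi-degree
% (d_1 + (M - 2) i, d_2) obtained after i derivatives in the genus-zero case.
\[
  g_1 = 0 \;\Longrightarrow\; 2g_1 - 2 + M = M - 2 .
\]
```

When $2g_1 - 2 + M$ is negative no twist is needed at all, which is consistent with the maximum $J = \max\{2g_1 - 2 + M, 0\}$ appearing in the statement.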
Two further remarks are necessary in order to turn this sketch into a proof. First, one needs to be careful about twisting the line bundle $L$ because this can make it difficult to compute deg R~ in 2.8; indeed the proof is cohomological in nature and requires that the sections constructed all be sections of the same line bundle. This problem is not serious, however, as we are merely twisting by fibres of the first projection and so one can twist all bundles involved without adversely affecting the construction. The second possible problem is that on higher genus curves, the derivative of [...] not as a section of a line bundle on $C_1 \times C_2$. This too, however, plays no crucial role in the proof of 2.14 (and its analogue on $C_1 \times C_2$). Again the proof is cohomological in nature and only depends on numerical effectivity of a line bundle on a blow up of $C_1 \times C_2$ (see [N3] for details), and for this one does not need global sections but just non-zero sections on any curve in $C_1 \times C_2$.
To conclude this section, it is worth pointing out that Theorem 2.10 does not hold, as stated, on a product of three or more curves of genus $> 1$: in other words, the data of the degrees $d_i$ of $L$ and the top intersection number $c_1(L)^{\mathrm{top}}$ are not sufficient to bound the sum of the relevant volumes. The reason for this, thinking of the proof of Theorem 2.10, is that the only proper product subvarieties $X$ of $C_1 \times C_2$ are points or fibres of the two projections. In both cases, the degrees $d_1, d_2$ determine, at least up to numerical equivalence, the line bundle $L|_X$. In higher dimension this is no longer the case and we take advantage of precisely this fact in producing a higher dimensional example where the direct generalization of Theorem 2.10 fails.
Let $C$ be a smooth projective curve of genus $g > 1$ and let $Y = C \times C$. Let $F_1 \subset Y$ be a fibre of the first projection, $F_2 \subset Y$ a fibre of the second projection, and $\Delta' = \Delta - F_1 - F_2$, where $\Delta \subset Y$ is the diagonal. Finally, let $n = (n_1, n_2, n_3)$ be a 3-tuple of integers. We consider divisors of the form

$V_n = n_1 F_1 + n_2 F_2 + n_3 \Delta'$.

Since $F_1 + F_2$ is ample and $V_n \cdot (F_1 + F_2) = n_1 + n_2$, some multiple of $V_n$ is effective provided $n_1 + n_2 > 0$ and $V_n^2 > 0$. We have

$V_n^2 = 2 n_1 n_2 - 2 g n_3^2$.

Hence given any $\epsilon > 0$ there exist $n_1, n_2, n_3$ such that $n_1 \gg n_2 \gg 0$, $V_n$ is effective, and

[...]

Let $X = C \times C \times C$ and fix a point $P \in C$. Let $p : X \to Y$ denote the projection to the first two factors and let $\pi_3 : X \to C$ denote the projection to the third factor. Define

[...]

Choose $0 \ne s \in H^0(Y, O_Y(V_n))$ and let $t \in H^0(C, O_C(P))$ be the section whose divisor of zeroes is $P$. Then [...] 1 for any point $\zeta \in X$. Thus if Theorem 2.10 held for $G$ one would find (up to an $o(1)$ depending on the constants implied in $n_1 \gg n_2 \gg 0$) that

[...]

contradicting 2.17.

It should be noted that the particular divisor $V_n$ which we chose above will play a central role in Vojta's proof of the Mordell conjecture to be given in the next section. As was noted above, Vojta's Dyson lemma is extremely powerful when applied to a divisor like $V_n$ as it says that no section of $H^0(C \times C, V_n)$ can have large index at any point.
3. THE MORDELL CONJECTURE
In this section we will give a sketch of Vojta's proof [V2, V3] of the famous Mordell conjecture:

Theorem 3.1 (Faltings). Suppose $C$ is a smooth projective curve of genus $\ge 2$ defined over a number field $k$. Then $C$ has at most finitely many $k$-rational points.
Of course, Mordell originally conjectured Theorem 3.1 only for the case where $k = \mathbb{Q}$. This is in fact the case which we will ultimately consider, not because it is intrinsically easier than the general case but because there is less notational complication as $\mathbb{Q}$ admits only one embedding in $\mathbb{C}$.

How does one use Vojta's Theorem 2.10 to prove the Mordell conjecture? Vojta's beautiful idea is to take advantage of the freedom to choose the line bundle $L$ in Theorem 2.10. If $C$ is a curve of genus at least one then there are more line bundles on $C \times C$ than the pull-backs of bundles from the two projections because the diagonal is not linearly equivalent to a sum of fibres. This additional parameter of freedom allows for the choice of a line bundle $L$ (given in fact by the divisor $V_n$ at the end of the last section) such that [...]. As was noted after the statement of Theorem 2.10, if we also arrange for $d_1 \gg d_2$ then 2.12 tells us that no section $s \in H^0(C \times C, L)$ can have large index at any point. Thus all of the hard work we spent in Roth's theorem trying to show that the auxiliary polynomial $P(X_1, \ldots, X_m)$ has small index at $(p_1/q_1, \ldots, p_m/q_m)$ is accomplished automatically, regardless of our choice of section $s$. This sounds too good to be true and indeed it is. It will turn out that what was easy in the case of Roth's theorem (namely to produce an arithmetic reason