HAL Id: hal-00347376
https://hal.archives-ouvertes.fr/hal-00347376
Submitted on 15 Dec 2008
Bruno Durand, Alexandre Zvonkine
To cite this version:
Bruno Durand, Alexandre Zvonkine. Kolmogorov complexity. É. Charpentier, A. Lesne, N. Nikolski.
Kolmogorov’s Heritage in Mathematics, Springer-Verlag, pp. 281-299, 2007. ⟨hal-00347376⟩
Kolmogorov Complexity
By Bruno Durand and Alexander Zvonkin
The term "complexity" has different meanings in different contexts. Computational complexity measures how much time or space is needed to perform some computational task. On the other hand, the complexity of description (also called Kolmogorov complexity) is the minimal number of information bits needed to define (describe) a given object. It may well happen that a short description requires a lot of time and space to follow it and actually construct the described object. However, when speaking about Kolmogorov complexity, we usually ignore this problem and count only the description bits.
As was common for him, Kolmogorov published, in 1965, a short note [10] that started a new line of research. Aside from the formal definition of complexity, he also suggested using this notion in the foundations of probability theory. His idea was quite simple:
An object is random if it has maximal possible complexity.
The definition of complexity uses the notion of an algorithm; this unexpected marriage of two a priori distant domains (in our case, probability theory and the theory of algorithms) is also a typical trait of Kolmogorov's work.
1.1 Algorithms
The notion of an algorithm is quite recent. In 1912 (when neither computers nor programming languages existed) Émile Borel (see [19]) used the phrase "a formal and precise automatic rule describing an object", which we would now call an algorithm.(1) However, a mathematical theory of algorithms was developed only in the 1930s (by Turing, Gödel, Post, Church, Kleene and others). The key observation was the existence of a universal algorithm (see below); it allows one to prove easily that some problems (e.g., the so-called halting problem, which asks whether a given algorithm terminates on a given input) are undecidable (cannot be solved by algorithms). Note that to prove the non-existence of an algorithm that solves a certain problem we need a mathematically precise definition of this notion. Once it appeared, this notion became a subject of the theory of algorithms, also called the theory of recursive functions or the theory of computability.
The remaining part of this section discusses some aspects of the notion of algorithm; the reader not interested in these details may skip it and proceed directly to Section 1.2.
It is rather difficult to give a mathematical definition that captures the intuitive idea of an algorithm in its full generality; instead, we may define a specific class of algorithms and claim that this class is representative, i.e., that any algorithm is equivalent to a certain algorithm in this class. (By the way, one of these classes was suggested by Kolmogorov.)
1.1.1 Models of computation
A model of computation formally describes some specific class of algorithms (the class of objects used as input/output data, how they are processed, etc.). Some computational models resemble programming languages while others look more like a hardware description. In any case, we assume that computational resources are unlimited (and forget that in real programming languages integers are usually bounded, processor architecture has a fixed word length, etc.).
(The study of resources (time and space) needed to solve a given problem is a different field called computational complexity. Let us note that an important notion in this field, NP-completeness, was introduced at the beginning of the 1970s independently by three researchers, one of whom, Leonid Levin, is Kolmogorov's student. The first publications by Levin were about Kolmogorov complexity [21]. His short biography and a brief story of how Kolmogorov influenced him may be found in the book [17].)
(1) The history of the term "algorithm" is interesting in itself. This word is a derivative of the name of a medieval Persian savant Al-Khwārizmī (787 – c. 850), who was the author of a book through which the Europeans learned the positional number system and the rules of arithmetic operations (addition, multiplication, etc.). The name of Al-Khwārizmī (which means "from Khorezm", a town in Uzbekistan today called Khiva) was transliterated in Latin as Algorithmus. The term "algorithms" meant at the beginning the rules of the four arithmetic operations. Then by extension it took on the meaning of any systematic method of computation. Leibniz called "algorithms" the set of rules for computing differentials and integrals. It is only gradually that the word acquired its modern meaning; one hundred years ago this process was not yet [...]
Which computational model is the best one? This depends on our purposes. If we want to write real programs, it is natural to use a real computer and an appropriate programming language. On the other hand, if we want to prove theorems it would be more convenient to work with an abstract model of computation; a very simple model, with a small number of primitives, would then be better. However, there is no canonical model adapted for proofs since different models are more suitable for different results.
The most popular model is the Turing machine. It is rather easy to prove the universality of this model; however, we have to deal with many details concerning tapes, symbols, representation of the transition table, etc. There are many versions of Turing machines; the most common one was, by the way, presented by Post and not by Turing.
Recursive functions à la Church give a more mathematical and attractive model, though the proofs of certain basic theorems become somewhat discouraging if not frightening.
Markov algorithms are similar to rewriting systems for strings with termination conditions; this model is difficult to manipulate (but well suited for the proof of the undecidability of word problems).
The RAM (random access machines) model resembles von Neumann-style computers...
When teaching the theory of algorithms, one may choose a different approach and not fix any specific model but rely directly on the intuition of algorithms. More formally, it means that we have to accept some properties of algorithms used in the proofs as axioms. Then we do not need to go into cumbersome details of a specific computational model; the price is, however, that the list of axioms is open (e.g., if during the proof we need to establish the computability of some function, we just describe informally its computation and then add a new axiom saying that this function is computable).
1.1.2 All models of computation are equivalent
Why do we believe that this or that computational model correctly reflects the intuitive notion of an algorithm? This statement is usually called the Church thesis (for a given computation model): it claims that any computable function (computed by an algorithm in the informal sense) is computable in this model. This assertion is not a mathematical one; it is a belief concerning the notion of intuitive computability. On the other hand, we can prove that these assertions for different computation models are equivalent, since it turns out that the class of computable functions is the same for different existing models (Turing machines, recursive functions, etc.).
The name given to the thesis is rather inappropriate. Church claimed that all intuitively computable total functions are computable in his model. A long contro- [...] equivalence theorem for two different models (recursive functions à la Church and Turing machines) was established by Turing in his seminal article, and the thesis in its most general form was formulated by Post. Therefore, a more appropriate name would be the Church–Turing–Post thesis.
All this was done in the 1930s, so why might Kolmogorov want to suggest a different computation model in the 1950s? His motivation could be reconstructed as follows. Though all computation models mentioned above are equivalent, the translation between them sometimes replaces one step in one model by a long sequence of steps in another one. For example, an addition may be an elementary operation in some programming language while its implementation by a Turing machine requires many steps.
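To make the cost of such a translation concrete, here is a minimal sketch (not taken from the source) of a one-tape Turing machine that adds 1 to a binary number. The transition table `RULES`, the state names and the helper `run_increment` are illustrative assumptions; the only point is that a single `+ 1` in a programming language becomes a number of machine steps that grows with the length of the input.

```python
# Hypothetical one-tape Turing machine that increments a binary number.
# The head starts on the leftmost bit of the input.
BLANK = "_"

# (state, symbol read) -> (symbol to write, head move, next state)
RULES = {
    ("right", "0"): ("0", +1, "right"),     # walk to the right end of the number
    ("right", "1"): ("1", +1, "right"),
    ("right", BLANK): (BLANK, -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),     # 1 + carry = 0, the carry moves left
    ("carry", "0"): ("1", 0, "halt"),       # 0 + carry = 1, done
    ("carry", BLANK): ("1", 0, "halt"),     # overflow: write a new leading 1
}


def run_increment(bits):
    """Run the machine on `bits`; return (result string, number of steps)."""
    tape = dict(enumerate(bits))
    state, head, steps = "right", 0, 0
    while state != "halt":
        symbol = tape.get(head, BLANK)
        write, move, state = RULES[(state, symbol)]
        tape[head] = write
        head += move
        steps += 1
    cells = [tape.get(i, BLANK) for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(BLANK), steps
```

For instance, `run_increment("1011")` returns `("1100", 8)`: the machine needs eight elementary steps for what a programming language treats as one operation.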
Kolmogorov wanted to find a model whose steps are elementary in the sense that they do not allow natural decomposition into a sequence of simpler steps. On the other hand, he tried to find a most general (and natural) model among these models. This means that elementary steps of any other model (if they are indeed elementary according to our intuition) should not require further decomposition when translated into Kolmogorov's model.
1.1.3 Kolmogorov–Uspensky machines
The model suggested by Kolmogorov was later called Kolmogorov–Uspensky machines. These machines are not related to Kolmogorov complexity, but they are related to Kolmogorov himself; hence we say a couple of words about them.
The configuration (state of the computation) of a Kolmogorov–Uspensky machine is a graph; some node of this graph is declared to be active. The program for the machine is a list of rules that say how this "active part" should be transformed and when the processing halts. So the computation step is indeed "local"; it deals with a finite size neighborhood of the active node. On the other hand, the "topological structure" of the computation can become rather complicated. This may be considered as a disadvantage of the model since it allows some actions that are hard to perform in a physical space. (For example, a Kolmogorov–Uspensky machine can create a labeled tree that provides an unreasonably fast access to an exponential amount of information.) So one may want to restrict somehow the class of allowed graphs [19, 8, 1]. Later a version of this model was considered by Schönhage (who used directed graphs with unlimited in-degrees). It seems pertinent to mention here the development of the GASM (Gurevich Abstract State Machines), which were inspired by Kolmogorov–Uspensky machines but have other goals and do not play a specific role in the classical computability theory. The first complete description of Kolmogorov–Uspensky machines may [...]
1.1.4 Universality
Now we are accustomed to the idea that the same processor can be used to perform different tasks if provided with a suitable program. However, this idea of universal computation was a nontrivial and very important step in the development of the first real computers.
The same idea can be formally expressed as follows: there exists a universal computable function $U$ of two arguments $p$ and $x$. The universality means that we can obtain any computable function of $x$ by fixing an appropriate first argument $p$ (a program for this function).
Why does a universal function exist? Imagine an interpreter of an arbitrary programming language that considers its first argument $p$ as a program and executes this program using $x$ as its input.
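The interpreter argument can be illustrated with a small sketch in which Python's built-in `exec` plays the role of the interpreter. The function name `U` follows the text; the convention that a program is a source string defining a one-argument function called `main` is an assumption of this example, not something fixed by the source.

```python
def U(p, x):
    """Toy universal function.

    `p` is Python source text that is assumed to define a one-argument
    function named `main`; U loads the program and applies it to `x`.
    """
    namespace = {}
    exec(p, namespace)            # the interpreter "reads" the program p
    return namespace["main"](x)   # ... and runs it on the input x


# Two different programs, both executed by the same function U.
square = "def main(x):\n    return x * x\n"
reverse = "def main(x):\n    return x[::-1]\n"
```

Fixing the first argument gives ordinary one-argument functions: `U(square, 7)` evaluates the squaring program and `U(reverse, "abc")` the reversing one. As in the text, `U(p, x)` is undefined whenever the program `p` does not terminate on `x`.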
1.1.5 Non-computable functions
The existence of a universal computable function immediately brings us to a paradox. Consider the function $F(p) = U(p, p) + 1$. This (unary) function is computable since $U$ is. It should then have a program associated to it (since $U$ is universal); let us denote this program by $q$. What happens if we apply program $q$ to itself? By definition of $U$ this gives $U(q, q)$. On the other hand, since $q$ is a program for $F$, the same result must be equal to $F(q) = U(q, q) + 1$. So we get $U(q, q) = F(q) = U(q, q) + 1$, and this seems impossible.
The only way to explain this paradox is to recall that certain computations may never terminate, so a program may compute a non-total function. And the contradiction disappears if $U(q, q)$ is not defined.
A similar argument shows that the halting problem is undecidable: there is no algorithm that gets a program $p$ and input $x$ and tells whether $U(p, x)$ is defined (i.e., whether the program $p$ terminates on input $x$).
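The diagonal construction behind this statement can be sketched in code. Here `halts` stands for a hypothetical decider (no correct one exists, which is the point), and `make_diagonal` and `refute` are names introduced only for this illustration. Given any candidate decider, the sketch builds a program on which the candidate's answer is wrong.

```python
def make_diagonal(halts):
    """Build the program that defeats a claimed halting decider.

    `halts(program, argument)` is a hypothetical total function that is
    supposed to return True exactly when program(argument) terminates.
    """
    def diagonal(program):
        if halts(program, program):
            while True:       # the decider said "halts": loop forever instead
                pass
        return "halted"       # the decider said "loops": halt immediately
    return diagonal


def refute(halts):
    """Return (claimed, actual) for the diagonal program applied to itself."""
    d = make_diagonal(halts)
    claimed = halts(d, d)
    # By construction d(d) does the opposite of what the decider claims.
    # We only run d(d) when it is known to terminate (claim was False).
    if not claimed:
        assert d(d) == "halted"
    actual = not claimed
    return claimed, actual
```

Whatever candidate is passed in, the two components of `refute(halts)` differ, so the candidate is wrong on at least one input.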
1.1.6 Back to algorithms
Returning to practice, let us note that the notion of a computable function captures only one aspect of algorithmic practice. For example, the behavior of a real-time algorithm (such as an operating system) is a more complicated thing than a mere function. The choice of a correct mathematical model for this class of algorithms (very important for practice) is a well studied but not fully solved problem of theoretical computer science.
1.2 Descriptions and sizes
Any information may be encoded as a bit string (a finite sequence of bits). For [...]
Binary strings are also called words in the alphabet $B = \{0, 1\}$, and the set of all binary strings is denoted as $B^*$. We identify $B^*$ with the set $\mathbb{Z}_+ \setminus \{0\} = \{1, 2, 3, \ldots\}$ using the lexicographic order. (The empty word is associated with 1, then $0 \mapsto 2$, $1 \mapsto 3$, $00 \mapsto 4$, $01 \mapsto 5$, etc.: a string $u$ is associated with a natural number that has binary representation $1u$. For example, the word 00 corresponds to the number $100_2$, i.e., 4.)
The length $|u|$ of a binary word $u$, i.e., the number of letters in it, is then equal to the integral part $\lfloor \log u \rfloor$ of the binary logarithm of the number associated with $u$. (Note that $|u|$ stands for the length of the word $u$ and not for the absolute value of the corresponding integer.)
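The identification of binary words with positive integers described above, and the resulting formula for the length, are easy to check mechanically. The following sketch uses hypothetical helper names (`word_to_number`, `number_to_word`, `word_length_from_number`) that do not appear in the source.

```python
def word_to_number(u):
    """Map a binary word u to the positive integer whose binary form is 1u."""
    return int("1" + u, 2)


def number_to_word(n):
    """Inverse map: drop the leading 1 of the binary representation of n."""
    if n < 1:
        raise ValueError("only positive integers correspond to words")
    return bin(n)[3:]   # bin(n) looks like '0b1...'; skip '0b' and the leading 1


def word_length_from_number(n):
    """|u| as the integral part of log2(n), computed with integer arithmetic."""
    return n.bit_length() - 1
```

With these definitions the empty word maps to 1, `"0"` to 2, `"1"` to 3, `"00"` to 4 and `"01"` to 5, as in the text, and `word_length_from_number(word_to_number(u))` equals `len(u)` for every word `u`.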
Definition 1.2.1. Let $f : B^* \to B^*$ be a computable function. We define the complexity of $x \in B^*$ with respect to $f$ as
$$K_f(x) = \begin{cases} \min |t| \text{ such that } f(t) = x; \\ \infty \text{ if such } t \text{ does not exist.} \end{cases}$$
In other terms, we call descriptions of $x$ (with respect to $f$) all strings $t$ such that $f(t) = x$; then the complexity $K_f(x)$ is defined as the length of the shortest description.
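For a function $f$ that terminates on every input, the definition can be turned directly into a brute-force search over descriptions in order of increasing length. The sketch below is only an illustration under that assumption: the name `complexity`, the cut-off `max_len` and the toy decompressor `doubling` are all introduced here and are not part of the source.

```python
from itertools import product


def complexity(f, x, max_len=12):
    """Length of the shortest binary word t with f(t) == x.

    `f` must be total (terminate on every binary word), otherwise the
    search could hang. Returns float('inf') when no description of length
    <= max_len exists; the cut-off is an assumption of this sketch, not
    part of the definition.
    """
    for length in range(max_len + 1):
        for bits in product("01", repeat=length):
            if f("".join(bits)) == x:
                return length
    return float("inf")


def doubling(t):
    """Toy decompressor: the description t describes the word tt."""
    return t + t
```

With respect to `doubling`, the word `"0101"` has complexity 2 (its shortest description is `"01"`), while `"011"` has no description at all. With respect to the identity function every word has complexity equal to its length, which shows how strongly $K_f$ depends on the choice of $f$.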
The main problem with this definition is that the complexity depends on the choice of $f$. This is unavoidable, but the theorem stated below (due to Kolmogorov but already present, in an informal way, in the paper of Solomonoff [18]) explains in which way this dependence can be limited. This theorem was later independently proved by Chaitin but does not appear in his first papers on the subject [2, 3]; the priority claims have provoked a long and futile controversy explained in [13].
Theorem 1.2.1 (Existence of an optimal function). There exists a computable function $f_0$ (called optimal function) such that for any other computable function $f$ there exists a constant $C$ such that
$$\forall x \quad K_{f_0}(x) \le K_f(x) + C. \tag{1.2.1}$$
(Note that the constant $C$ may depend on $f$ but not on $x$.)
Proof. Let $t$ be a shortest description of $x$ with respect to $f$, i.e., $f(t) = x$. Then $f_0$ uses as a description of $x$ the pair $(p, t)$ where $p$ is a program that computes the function $f$. In this pair $p$ has $|p|$ bits and $t$ has $|t|$ bits, so the total number of bits is $|p| + |t|$, i.e., $|p| + K_f(x)$. So we let $C = |p|$.
Remark 1.2.1. This argument needs some refinement. We cannot use the pair $(p, t)$ directly; we need to encode it by a single string. Not any encoding will work. An appropriate encoding may encode $p$ in a very inefficient way; this only increases the constant $C$. On the other hand, it is essential to be able to encode $t$ without any loss of space, since an encoding of $t$ which demands, say, $\alpha |t|$ bits with $\alpha > 1$ leads to the complexity $\alpha K_f(x) + C$ instead of $K_f(x) + C$.
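One standard encoding that satisfies these requirements (offered here as an illustration; the source does not say which encoding the authors have in mind) doubles every bit of $p$, appends the marker 01, and then writes $t$ unchanged. The program part is encoded wastefully, but $t$ costs exactly $|t|$ bits, so the overhead $2|p| + 2$ is a constant that depends only on $p$. The names `encode_pair` and `decode_pair` are hypothetical.

```python
def encode_pair(p, t):
    """Encode (p, t) as one binary word.

    Each bit of p is doubled, then comes the marker 01, then t unchanged.
    The total length is 2|p| + 2 + |t|.
    """
    return "".join(bit * 2 for bit in p) + "01" + t


def decode_pair(s):
    """Recover (p, t) from a word produced by encode_pair."""
    i, p = 0, []
    while s[i:i + 2] != "01":
        if s[i:i + 2] not in ("00", "11"):
            raise ValueError("not a valid pair encoding")
        p.append(s[i])
        i += 2
    return "".join(p), s[i + 2:]
```

Decoding is unambiguous because the doubled part consists only of the blocks 00 and 11, so the first block 01 must be the marker. An optimal function can then be obtained by decoding its input into $(p, t)$ and running the program $p$ on $t$.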
Corollary 1.2.1. If $f_1$ and $f_2$ are two optimal functions then there exists a constant $C$ such that
$$\forall x \quad |K_{f_1}(x) - K_{f_2}(x)| \le C. \tag{1.2.2}$$
Proceeding from this corollary, we choose some optimal function $f_0$ and fix it. The subscript $f_0$ in $K_{f_0}$ is then suppressed. However, after doing this we still have in mind that in fact the Kolmogorov complexity is defined only up to a bounded additive term.
Definition 1.2.2. The Kolmogorov complexity $K(x)$ is the complexity $K_{f_0}(x)$ with respect to some optimal function $f_0$. The complexity $K(x)$ is defined up to a bounded additive term.
Proposition 1.2.1.
$$K(x) \le |x| + C; \quad \text{or, equivalently,} \quad K(x) \le \log x + C. \tag{1.2.3}$$
Proof. It suffices to let $f(x) = x$ in (1.2.1), i.e., to use $x$ itself as a description of $x$.
Proposition 1.2.2 (Distribution of complexities). Consider all binary strings of length $n$. The fraction of strings $x$ of length $n$ such that $K(x) < n - k$ does not exceed $2^{-k}$.
Proof. The number of strings of length $n$ is $2^n$ while the number of (potential) descriptions of length less than $n - k$ is
$$1 + 2 + \ldots + 2^{n-k-1} < 2^{n-k}.$$
There exist strings of length $n$ whose complexity is at least $n$ (they are often called incompressible strings). Indeed, there are $2^n$ strings of length $n$ and at most $1 + 2 + \ldots + 2^{n-1} = 2^n - 1$ potential descriptions of length less than $n$.
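Both counting arguments rest on the same geometric sum, which can be checked numerically. The sketch below uses hypothetical helper names and simply counts potential descriptions; it says nothing about which particular strings are compressible.

```python
def descriptions_shorter_than(m):
    """Number of binary words of length < m, i.e. 1 + 2 + ... + 2**(m - 1)."""
    return sum(2 ** length for length in range(m))


def max_fraction_compressible(n, k):
    """Upper bound on the fraction of length-n strings x with K(x) < n - k.

    Each description describes at most one string, so the number of such
    strings is at most the number of descriptions shorter than n - k.
    """
    return descriptions_shorter_than(n - k) / 2 ** n
```

For example, `max_fraction_compressible(20, 10)` is below $2^{-10}$: fewer than one string of length 20 in a thousand can have complexity below 10. Taking $k = 0$ gives `descriptions_shorter_than(n) == 2**n - 1`, one fewer than the number of strings of length $n$, so at least one string of each length is incompressible.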
One may ask for an example of an incompressible string. However, it is not possible to find an incompressible string of length $n$ effectively (having $n$ as input). Indeed, if it were possible, a string generated by this algorithm would have complexity at most $\log n + c$, since we need to specify $n$ (about $\log n$ bits) and the algorithm itself (a constant number of bits), and $\log n + c$ is less than $n$ for all sufficiently large $n$.
Incompressible strings are a useful tool in theoretical computer science (automata theory, formal languages, etc.).
Today everybody uses software for data compression and decompression; this [...]
However, the Kolmogorov complexity theory may still provide useful hints: for example, if a software advertisement claims that the latest version of a super-compressor compresses every file by a certain factor, you had better avoid this product.
Finally, to prepare for the next section (on Gödel's incompleteness theorem), we present a variation on a well known theme of "busy beavers". Initially the busy beaver numbers were defined as follows. Consider Turing machines that have at most $n$ states and whose tape alphabet consists of two symbols (say, "blank" and "stroke"). We start such a machine on the blank tape. Some machines do not terminate at all. For the machines that terminate we count the number of steps; let $T(n)$ be the maximal number of steps among the terminating machines with at most $n$ states.
Evidently, $T(n)$ is an increasing function of $n$ since we consider all machines that have at most $n$ states. It grows very fast; in fact, it grows faster than any computable function (it does not have a computable upper bound). Indeed, if a computable upper bound $f(n)$ exists, it may be used to solve the halting problem, since we know that if a machine with $n$ states does not terminate after $f(n)$ steps, it will never terminate. So no computable function, even a fast growing one, like $n!^{n!^{\cdot^{\cdot^{\cdot^{n!}}}}}$ ($n!$ levels), is an upper bound for $T(n)$.
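For very small $n$ the quantity $T(n)$ can be explored by brute force. The sketch below fixes one particular convention, which is an assumption of this example since the text leaves the details open: states are numbered from 0, every transition writes a symbol and moves the head one cell left or right, and the transition into the halting state counts as a step. Because non-termination cannot be detected, each machine is cut off after `max_steps` steps, so in general the function only gives a lower bound for $T(n)$.

```python
from itertools import product

HALT = -1


def run(machine, max_steps):
    """Run a 2-symbol machine on the blank tape.

    Return the number of steps if it halts within max_steps, else None.
    """
    tape, head, state = {}, 0, 0
    for step in range(1, max_steps + 1):
        write, move, state = machine[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        if state == HALT:
            return step
    return None


def longest_halting_run(n, max_steps=50):
    """Largest step count among n-state machines halting within max_steps."""
    keys = [(s, b) for s in range(n) for b in (0, 1)]
    actions = [(w, m, t) for w in (0, 1) for m in (-1, 1)
               for t in list(range(n)) + [HALT]]
    best = 0
    for choice in product(actions, repeat=len(keys)):
        steps = run(dict(zip(keys, choice)), max_steps)
        if steps is not None:
            best = max(best, steps)
    return best
```

Machines with fewer than $n$ states are covered automatically, because an $n$-state table may leave some states unreachable. Under this convention `longest_halting_run(1)` gives 1 and `longest_halting_run(2)` gives 6, in agreement with the known two-state busy beaver value; already for five or six states the search space and the required cut-off make this naive approach hopeless.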
But here we consider a different (but related) fast-growing function. Let us define $Æ(n)$ as the biggest integer that has complexity less than $n$. It exists since the number of descriptions of size less than $n$ is finite. By definition we have $n \le K(x)$ for any $x > Æ(n)$, e.g., for $x = Æ(n) + 1$. If the function $Æ$ were computable we would have $K(Æ(n) + 1) \le \log n + C$ since $n$ might serve as a description of $Æ(n) + 1$. The contradiction is evident. Hence, $Æ$ is not computable. In a similar way we can prove that $Æ$ grows faster than any computable function. (It suffices to replace $Æ(n)$ in the preceding inequalities by any computable upper bound for $Æ$.)
1.3 Gödel's theorem
1.3.1 It is proved that one cannot prove everything
The function $K(x)$ is not computable. How can we use it? For example, to prove theorems. Maybe the most remarkable example is the proof of Gödel's incompleteness theorem. Roughly speaking, this theorem claims that not all truths are provable. Mathematics has its intrinsic limits: there exist propositions that are true but impossible to prove.
We propose to you a more concrete form of a proposition that is true but unprovable; it was suggested by Gregory Chaitin [4].