More Resources, More Power?

4.2. Time Hierarchies and Gaps

4.2.1. Time Hierarchies

4.2.1.1. The Time Hierarchy Theorem

In the following theorem (which separates DTIME(t1) from DTIME(t2)), we refer to the model of two-tape Turing machines. In this case we obtain quite a tight hierarchy in terms of the relation between t1 and t2. We stress that, using the Cobham-Edmonds Thesis, this result yields (possibly less tight) hierarchy theorems for any reasonable and general model of computation.

Teaching note: The standard statement of Theorem 4.3 asserts that for any time-constructible function t2 and every function t1 such that t2 = ω(t1 log t1) and t1(n) > n it holds that DTIME(t1) is strictly contained in DTIME(t2). The current version is only slightly weaker, but it allows a somewhat simpler and more intuitive proof. We comment on the proof of the standard version of Theorem 4.3 in a teaching note following the proof of the current version.

Theorem 4.3 (time hierarchy for two-tape Turing machines): For any time-constructible function t1 and every function t2 such that t2(n) ≥ (log t1(n))² · t1(n) and t1(n) > n it holds that DTIME(t1) is strictly contained in DTIME(t2).
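As a concrete illustration (my instantiation, not from the text; the choice t1(n) = n² is arbitrary), the theorem yields, in LaTeX notation:

\[
t_1(n) = n^2 \quad\Longrightarrow\quad t_2(n) = (\log n^2)^2 \cdot n^2 = 4(\log n)^2 \cdot n^2,
\]
\[
\text{and hence}\quad \mathrm{DTIME}(n^2) \subsetneq \mathrm{DTIME}\bigl(4(\log n)^2 \cdot n^2\bigr).
\]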

As will become clear from the proof, an analogous result holds for any model in which a universal machine can emulate t steps of another machine in O(t log t) time, where the constant in the O-notation depends on the emulated machine. Before proving Theorem 4.3, we derive the following corollary.

Corollary 4.4 (time hierarchy for any reasonable and general model): For any reasonable and general model of computation there exists a positive polynomial p such that for any time-computable function t1 and every function t2 such that t2 > p(t1) and t1(n) > n it holds that DTIME(t1) is strictly contained in DTIME(t2).

It follows that, for every such model and every polynomial t (such that t(n) > n), there exist problems in P that are not in DTIME(t). It also follows that P is a strict subset of E and even of "quasi-polynomial time" (i.e., DTIME(q), where q(n) = exp(poly(log n))); moreover, P is a strict subset of DTIME(q), for any super-polynomial function q (i.e., q(n) = n^ω(1)).

We comment that Corollary 4.4 can be proven directly (rather than by invoking Theorem 4.3). This can be done by implementing the ideas that underlie the proof of Theorem 4.3 directly in the model of computation at hand (see Exercise 4.5). In fact, such a direct implementation, which is allowed "polynomial slackness" (i.e., t2 > p(t1)), is less cumbersome than the implementation presented in the proof of Theorem 4.3, where only a polylogarithmic factor is allowed in the slackness (i.e., t2 = Õ(t1)). We also note that the separation result in Corollary 4.4 can be tightened; for details see Exercise 4.6.

Proof of Corollary 4.4: The underlying fact is that separation results regarding any reasonable and general model of computation can be "translated" to analogous results regarding any other such model. Such a translation may affect the time bounds, as demonstrated next. Letting DTIME2 denote the classes that correspond to two-tape Turing machines (and recalling that DTIME denotes the classes that correspond to the alternative model), we note that DTIME(t1) ⊆ DTIME2(t1') and DTIME2(t2') ⊆ DTIME(t2), where t1' = poly(t1) and t2' is defined such that t2(n) = poly(t2'(n)). The latter unspecified polynomials, hereafter denoted p1 and p2, respectively, are the ones guaranteed by the Cobham-Edmonds Thesis. Also, the hypothesis that t1 is time-computable implies that t1' = p1(t1) is time-constructible with respect to the two-tape Turing machine model. Thus, for a suitable choice of the polynomial p (i.e., p(p1⁻¹(m)) ≥ p2(m²)), it holds that

t2'(n) = p2⁻¹(t2(n)) > p2⁻¹(p(t1(n))) = p2⁻¹(p(p1⁻¹(t1'(n)))) ≥ t1'(n)²,

where the first inequality holds by the corollary's hypothesis (i.e., t2 > p(t1)) and the last inequality holds by the choice of p. Invoking Theorem 4.3 (while noting that t2'(n) > t1'(n)² ≥ (log t1'(n))² · t1'(n)), we obtain the strict inclusion DTIME2(t1') ⊂ DTIME2(t2'). Combining the latter with DTIME(t1) ⊆ DTIME2(t1') and DTIME2(t2') ⊆ DTIME(t2), the corollary follows.
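For concreteness, here is an arbitrarily chosen numeric instantiation (mine, not from the text) of the condition on p:

\[
p_1(m) = m^2,\; p_2(m) = m^3 \quad\Longrightarrow\quad p\bigl(p_1^{-1}(m)\bigr) \ge p_2(m^2) \;\Longleftrightarrow\; p(k) \ge p_2\bigl(p_1(k)^2\bigr) = k^{12},
\]

so taking p(m) = m¹² satisfies the requirement for this pair of translation polynomials.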

Proof of Theorem 4.3: The idea is constructing a Boolean function f such that all machines having time complexity t1 fail to compute f. This is done by associating with each possible machine M a different input xM (e.g., xM = ⟨M⟩) and making sure that f(xM) ≠ M'(xM), where M'(x) denotes an emulation of M(x) that is suspended after t1(|x|) steps. For example, we may define f(xM) = 1 − M'(xM).

We note that M' is used instead of M in order to allow for computing f in time that is related to t1. The point is that M may be an arbitrary machine that is associated with the input xM, and so M does not necessarily run in time t1 (but, by construction, the corresponding M' does run in time t1).

Implementing the foregoing idea calls for an efficient association of machines to inputs as well as for a relatively efficient emulation of t1 steps of an arbitrary machine. As shown next, both requirements can be met easily. Actually, we are going to use a mapping µ of inputs to machines (i.e., µ will map the aforementioned xM to M) such that each machine is in the range of µ and µ is very easy to compute (e.g., indeed, for starters, assume that µ is the identity mapping). Thus, by construction, f ∉ DTIME(t1). The issue is presenting a relatively efficient algorithm for computing f, that is, showing that f ∈ DTIME(t2).

The algorithm for computing f, as well as the definition of f (sketched in the first paragraph), are straightforward: On input x, the algorithm computes t = t1(|x|), determines the machine M = µ(x) that corresponds to x (outputting a default value if no such machine exists), emulates M(x) for t steps, and returns the value 1 − M'(x). Recall that M'(x) denotes the time-truncated emulation of M(x) (i.e., the emulation of M(x) suspended after t steps); that is, if M(x) halts within t steps then M'(x) = M(x), and otherwise M'(x) may be defined arbitrarily. Thus, f(x) = 1 − M'(x) if M = µ(x) and (say) f(x) = 0 otherwise.
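To make the algorithm concrete, the following Python sketch (my illustration, not part of the text) spells out the computation of f for the simple case where µ is the identity mapping; the helpers parse_machine and emulate_with_budget, which stand in for a concrete machine encoding and for a step-truncated universal emulation, are assumptions of the sketch rather than components fixed by the proof.

def compute_f(x, t1, parse_machine, emulate_with_budget):
    """Sketch of the diagonalizing function f (identity mapping mu(x) = x).

    Assumed helpers (hypothetical, supplied by the caller):
      - t1(n): the time-constructible bound, as an ordinary function;
      - parse_machine(x): returns the machine M encoded by x, or None
        if x encodes no machine;
      - emulate_with_budget(M, x, t): runs M on input x for at most t
        steps and returns M's output bit, or None on timeout.
    """
    t = t1(len(x))                    # step budget t = t1(|x|)
    M = parse_machine(x)              # M = mu(x); here mu is the identity
    if M is None:
        return 0                      # default value: x encodes no machine
    y = emulate_with_budget(M, x, t)  # the truncated emulation M'(x)
    if y is None:                     # M did not halt within t steps, so
        y = 0                         # M'(x) may be defined arbitrarily
    return 1 - y                      # f(x) = 1 - M'(x) differs from M'(x)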

In order to show that f ∉ DTIME(t1), we show that each machine of time complexity t1 fails to compute f. Fixing any such machine, M, we consider an input xM such that M = µ(xM), where such an input exists because µ is onto. Now, on the one hand, M'(xM) = M(xM) (because M has time complexity t1), while on the other hand, f(xM) = 1 − M'(xM) (by the definition of f). It follows that M(xM) ≠ f(xM).

We now turn to upper-bounding the time complexity of f by analyzing the time complexity of the foregoing algorithm that computes f. Using the time constructibility of t1 and ignoring the easy computation of µ, we focus on the question of how much time is required for emulating t steps of machine M (on input x). We should bear in mind that the time complexity of our algorithm needs to be analyzed in the two-tape Turing machine model, whereas M itself is a two-tape Turing machine. We start by implementing our algorithm on a three-tape Turing machine, and next emulate this machine on a two-tape Turing machine.

The obvious implementation of our algorithm on a three-tape Turing machine uses two tapes for the emulation itself and designates the third tape for the actions of the emulation procedure (e.g., storing the code of the emulated machine and maintaining a step-counter). Thus, each step of the two-tape machine M is emulated using O(|⟨M⟩|) steps on the three-tape machine.¹ This also includes the amortized complexity of maintaining a step-counter for the emulation (see Exercise 4.7).

Next, we need to emulate the foregoing three-tape machine on a two-tape machine. This is done by using the fact (cf., e.g., [123, Thm. 12.6]) that t steps of a three-tape machine can be emulated on a two-tape machine in O(t log t) steps. Thus, the complexity of computing f on input x is upper-bounded by O(Tµ(x)(|x|) log Tµ(x)(|x|)), where TM(n) = O(|⟨M⟩| · t1(n)) represents the cost of emulating t1(n) steps of the two-tape machine M on a three-tape machine (as in the foregoing discussion).

It turns out that the quality of the separation result that we obtain depends on the choice of the mapping µ (of inputs to machines). Using the naive (identity) mapping (i.e., µ(x) = x), we can only establish the theorem for t2(n) = Õ(n · t1(n)) rather than t2(n) = Õ(t1(n)), because in this case Tµ(x)(|x|) = O(|x| · t1(|x|)). (Note that, in this case, xM = ⟨M⟩ is a description of µ(xM) = M.) The theorem follows by associating the machine M with the input xM = ⟨M⟩01^m, where m = 2^|⟨M⟩|; that is, we may use the mapping µ such that µ(x) = M if x = ⟨M⟩01^(2^|⟨M⟩|) and µ(x) equals some fixed machine otherwise. In this case |⟨µ(x)⟩| < log₂|x| < log t1(|x|), and so Tµ(x)(|x|) = O((log t1(|x|)) · t1(|x|)). The theorem follows.
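For completeness, the arithmetic behind the last step (a routine expansion that the text leaves implicit) is as follows, writing T = Tµ(x)(|x|) and n = |x|:

\[
T = O\bigl((\log t_1(n)) \cdot t_1(n)\bigr)
\;\Longrightarrow\;
O(T \log T) = O\bigl( t_1(n)\, (\log t_1(n)) \cdot (\log t_1(n) + \log\log t_1(n)) \bigr) = O\bigl((\log t_1(n))^2 \cdot t_1(n)\bigr),
\]

which is indeed within the bound t2(n) ≥ (log t1(n))² · t1(n) allowed by the theorem's hypothesis.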

1. This overhead accounts both for searching the code of M for the adequate action and for the effecting of this action (which may refer to a larger alphabet than the one used by the emulator).

Teaching note: Proving the standard version of Theorem 4.3 cannot be done by associating a sufficiently long input xM with each machine M, because this does not allow for getting rid of the additional unbounded factor in Tµ(x)(|x|) (i.e., the |⟨µ(x)⟩| factor that multiplies t1(|x|)). Note that the latter factor needs to be computable (at the very least) and thus cannot be accounted for by the generic ω-notation that appears in the standard version (cf. [123, Thm. 12.9]). Instead, a different approach is taken (see footnote 2).

Technical comments. The proof of Theorem 4.3 associates with each potential machine M some input xM and defines the computational problem such that machine M errs on input xM. The association of machines with inputs is rather flexible: We can use any onto mapping of inputs to machines that is efficiently computable and sufficiently shrinking.

Specifically, in the proof, we used the mapping µ such that µ(x) = M if x = ⟨M⟩01^(2^|⟨M⟩|), and µ(x) equals some fixed machine otherwise. We comment that each machine can be made to err on infinitely many inputs by redefining µ such that µ(x) = M if ⟨M⟩01^(2^|⟨M⟩|) is a suffix of x (and µ(x) equals some fixed machine otherwise). We also comment that, in contrast to the proof of Theorem 4.3, the proof of Theorem 1.5 utilizes a rigid mapping of inputs to machines (i.e., there µ(x) = M if x = ⟨M⟩).
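As an illustration of how easily such a mapping can be computed, the following Python sketch (mine, not from the text; it assumes machine descriptions are binary strings, and fixed_machine is a hypothetical default) parses the padded encoding used in the proof:

def mu(x, fixed_machine="0"):
    """Sketch of the mapping mu from the proof: mu(x) = M if
    x = <M>01^(2^|<M>|), and some fixed machine otherwise.
    Machine descriptions are assumed to be binary strings (assumption).
    """
    for k in range(1, len(x)):
        desc, pad = x[:k], x[k:]
        # The padding must be a single 0 followed by exactly 2^k ones;
        # at most one k can work, since |x| = k + 1 + 2^k determines k.
        if len(pad) == 1 + 2 ** k and pad[0] == "0" and set(pad[1:]) == {"1"}:
            return desc
    return fixed_machine

Note that the returned description has length k < log₂|x|, which is exactly the shrinkage that makes the emulation overhead only logarithmic in t1(|x|).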

Digest: Diagonalization. The last comment highlights the fact that the proof of Theorem 4.3 is merely a sophisticated version of the proof of Theorem 1.5. Both proofs refer to versions of the universal function, which in the case of the proof of Theorem 4.3 is (implicitly) defined such that its value at (⟨M⟩, x) equals M'(x), where M'(x) denotes an emulation of M(x) that is suspended after t1(|x|) steps.³ Actually, both proofs refer to the "diagonal" of the aforementioned function, which in the case of the proof of Theorem 4.3 is only defined implicitly. That is, the value of the diagonal function at x, denoted d(x), equals the value of the universal function at (µ(x), x). This is actually a definitional schema, as the choice of the function µ remains unspecified. Indeed, setting µ(x) = x corresponds to a "real" diagonal in the matrix depicting the universal function, but any other choice of a 1-1 mapping µ also yields a "kind of diagonal" of the universal function.

Either way, the function f is defined such that for every x it holds that f(x) ≠ d(x). This guarantees that no machine of time complexity t1 can compute f, and the focus is on presenting an algorithm that computes f (which, needless to say, has time complexity greater than t1). Part of the proof of Theorem 4.3 is devoted to selecting µ in a way that minimizes the time complexity of computing f, whereas in the proof of Theorem 1.5 we merely need to guarantee that f is computable.
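In symbols (my paraphrase of the schema just described, writing u for the implicit truncated universal function):

\[
u(\langle M\rangle, x) = M'(x), \qquad d(x) = u(\mu(x), x), \qquad f(x) = 1 - d(x) \neq d(x),
\]

where the last inequality holds for every x because d is Boolean.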
