Using Kolmogorov complexity to solve the Halting problem

We assume that the reader is familiar with the notions of undecidability, Turing reductions, Kolmogorov complexity, the Halting problem, and related subjects.

The (easy) proof that the uncomputability of Kolmogorov complexity implies the undecidability of the Halting problem can be found in many lecture notes and books; usually the proof assumes that the Halting problem is decidable and derives the computability of Kolmogorov complexity, which contradicts its known uncomputability. In other words, given an oracle for the Halting problem, we can compute the Kolmogorov complexity of a string $x$.

But we can also derive the uncomputability of Kolmogorov complexity from the undecidability of the Halting problem; the proof is “less popular” but can nevertheless be found after a few searches on Google. For example, the technical report Gregory J. Chaitin, Asat Arslanov, Cristian Calude: Program-size Complexity Computes the Halting Problem, Bulletin of the EATCS 57 (1995) contains two different proofs, and the great book by Ming Li and Paul M.B. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, presents it as an exercise (with a hint on how to solve it that is credited to P. Gács by W. Gasarch in a personal communication, Feb 13, 1992). Here we give an extended proof, with more details, that the Halting problem can be decided using an oracle that computes the Kolmogorov complexity of a string, i.e. that the Halting problem is Turing reducible to Kolmogorov complexity.

Using Kolmogorov complexity to solve the Halting problem

The Halting problem is represented by the halting set:

$$HALT = \{ \langle M,x \rangle \mid M \text{ is a Turing machine that halts on input } x\}$$

Let $T_{max}(n)$ be the longest running time among the Turing machines of size $n$ that halt on the empty tape, and let $BB_n$ be a Turing machine that achieves $T_{max}(n)$ (note that $T_{max}$ is simply a variant of the Busy Beaver function in which the score is the number of steps taken before halting).
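In symbols, writing $\mathrm{steps}(M)$ for the number of steps a halting machine $M$ takes on the empty tape (a notation used here only for convenience):

$$T_{max}(n) = \max \{\, \mathrm{steps}(M) \mid |M| = n \text{ and } M \text{ halts on the empty tape} \,\}$$

and $BB_n$ is a machine of size $n$ that attains this maximum.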

Starting from the given pair $\langle M, x \rangle$, we can build a machine $M'$ that, on the empty tape, writes $x$ and then simulates $M(x)$; let $|M'| = n$. Clearly, if $M'$, of size $n$, is still running after $T_{max}(n)$ steps, then it will run forever.
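As a minimal sketch of this wrapper, in Python form: `run` is a hypothetical step-bounded simulator (it is not part of the original post) that returns the tape contents if the machine halts within the given bound, and `None` otherwise.

```python
def run(machine, tape, max_steps=None):
    """Hypothetical primitive: simulate `machine` on `tape` for at most `max_steps`
    steps (unbounded if None); return its output if it halts in time, else None."""
    raise NotImplementedError  # assumption: such a simulator exists for our machine encoding


def make_M_prime(M, x):
    """Build M': a zero-input machine that writes x on the empty tape and then runs M on it.
    In a reasonable encoding, |M'| = n is roughly |M| + |x| + O(1)."""
    def M_prime():
        return run(M, x)  # behave exactly like M on input x
    return M_prime

# Key observation: if M' (of size n) is still running after T_max(n) steps,
# it will never halt, because T_max(n) bounds the running time of every
# halting machine of size n on the empty tape.
```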

For large enough $n$ there exists another machine $Z$ of size $n+c < 2n$ that “embeds” the machine $BB_n$ and behaves as follows (a code sketch is given after the list):

  • computes $T_{max}(n)$ by simulating $BB_n$;
  • then enumerates all Turing machines $M_1, M_2, \ldots$ of size $< 2n$ and simulates them on the empty tape;
  • if $M_i$ halts within $T_{max}(n)$ steps, then $Z$ keeps track of the string $y_i$ left on its tape and of its length $l_i$; otherwise it continues with the next machine;
  • $Z$ halts when all the $M_i$s of size $< 2n$ have been scanned;
  • $Z$ outputs one of the strings of length $2n$ that have not been generated during the process; at least one of them exists by the pigeonhole principle (there are $2^{2n}$ strings of length $2n$ but only $2^{2n}-1$ programs of length less than $2n$).
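Here is a Python sketch of $Z$. The helpers `steps_to_halt`, `machines_of_size_less_than`, and `run_for` are hypothetical primitives introduced only to make the description concrete; in particular, $BB_n$ must be hard-wired into $Z$, which is exactly why $Z$ exists but cannot be effectively constructed.

```python
def steps_to_halt(machine):
    """Hypothetical: number of steps `machine` takes before halting on the empty tape."""
    raise NotImplementedError

def machines_of_size_less_than(k):
    """Hypothetical: enumerate all machine descriptions of length < k bits."""
    raise NotImplementedError

def run_for(machine, max_steps):
    """Hypothetical: run `machine` on the empty tape; return its output string
    if it halts within `max_steps` steps, otherwise None."""
    raise NotImplementedError


def Z(n, BB_n):
    """Sketch of the machine Z of size n + c < 2n described in the list above."""
    # 1. Compute T_max(n) by simulating the hard-wired BB_n until it halts.
    T_max = steps_to_halt(BB_n)

    # 2./3. Enumerate the machines of size < 2n, run each for at most T_max steps,
    #        and record the strings left on the tape by those that halt in time.
    generated = set()
    for M_i in machines_of_size_less_than(2 * n):
        y_i = run_for(M_i, T_max)
        if y_i is not None:
            generated.add(y_i)

    # 4./5. Output a string of length 2n that was never generated: there are 2**(2n)
    #        strings of length 2n but at most 2**(2n) - 1 machines of size < 2n,
    #        so by pigeonhole at least one such string exists.
    for k in range(2 ** (2 * n)):
        y = format(k, "b").zfill(2 * n)  # k written as a binary string of length 2n
        if y not in generated:
            return y
```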

Note that we are not able to actually build $Z$ (because we don’t know $BB_n$), but it exists.

Now suppose that the Kolmogorov complexity $C(x)$ of a string $x$ is computable; then for every string $y$ we can actually find a shortest Turing machine that generates it: simply compute $C(y)$ and dovetail the execution of all the Turing machines of size $C(y)$ until one of them halts with $y$ on the output tape.
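As a sketch, assuming $C$ is given as a computable Python function and reusing the same kind of hypothetical primitives (`machines_of_size`, `run_for`), the dovetailed search looks like this; running all candidates under increasing step bounds is what prevents a non-halting candidate from blocking the search:

```python
from itertools import count

def machines_of_size(k):
    """Hypothetical: enumerate all machine descriptions of length exactly k bits."""
    raise NotImplementedError

def run_for(machine, max_steps):
    """Hypothetical: output of `machine` on the empty tape if it halts within
    `max_steps` steps, otherwise None."""
    raise NotImplementedError


def shortest_program_for(y, C):
    """Given a (hypothetically) computable Kolmogorov complexity C, return (M, t):
    a machine of size C(y) that outputs y, and the time t it needs to do so."""
    candidates = list(machines_of_size(C(y)))
    for t in count(1):                 # increasing step bounds: 1, 2, 3, ...
        for M in candidates:
            if run_for(M, t) == y:     # halts within t steps with y on the tape
                return M, t            # guaranteed to happen, by definition of C(y)
```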

So we can find all the strings $y_i$ of length $2n$ that have $C(y_i) < 2n$, together with the corresponding Turing machines $M_i$ ($|M_i| < 2n$) that generate them; we can also calculate the time $t_i$ needed by $M_i$ to generate $y_i$.

If $T = \max_i \{ t_i \}$, then we must have $T > T_{max}(n)$: otherwise the machine $Z$ above would be able to simulate, until they halt, all the programs of size $< 2n$ that generate a string of length $2n$, and record their outputs; so, by construction, its output would be a string $y$ of length $2n$ with $C(y) \geq 2n$ (i.e. incompressible). But $|Z| = n + c < 2n$, and this is a contradiction, because we would have a program of size less than $2n$ that outputs a string whose Kolmogorov complexity is at least $2n$.

So $T$ is an upper bound for $T_{max}(n)$ and can be used to check whether $M'$ halts (simulate $M'$ for $T$ steps: if it has not halted by then, it never will), and thus to decide whether $\langle M, x \rangle \in HALT$.

$\Box$
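Putting everything together, here is the whole reduction as a hedged Python sketch. `C` is the assumed oracle for Kolmogorov complexity; `make_M_prime`, `size`, and `run_for` are hypothetical primitives of the kind used above, and `shortest_program_for` is the dovetailed search from the previous sketch; encoding details, and the finitely many small values of $n$ for which $n + c < 2n$ fails, are glossed over.

```python
def size(machine):
    """Hypothetical: length in bits of the machine's description."""
    raise NotImplementedError


def halts(M, x, C):
    """Decide whether M halts on input x, given an oracle C for Kolmogorov complexity."""
    # Step 1: build M', which runs M on x starting from the empty tape, and measure it.
    M_prime = make_M_prime(M, x)
    n = size(M_prime)

    # Step 2: for every string y of length 2n with C(y) < 2n, find a shortest program
    # for y and the time it needs; T is the maximum of these times.
    T = 0
    for k in range(2 ** (2 * n)):
        y = format(k, "b").zfill(2 * n)
        if C(y) < 2 * n:                        # oracle call
            _, t = shortest_program_for(y, C)   # dovetailed search (previous sketch)
            T = max(T, t)

    # Step 3: the argument above shows T > T_max(n), so running M' for T steps
    # settles the question: if it has not halted by then, it never will.
    return run_for(M_prime, T) is not None
```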

Conclusion

The two reductions, from the Halting problem to Kolmogorov complexity and from Kolmogorov complexity to the Halting problem, imply that the two problems are Turing equivalent (they belong to the same Turing degree).

 

3 thoughts on “Using Kolmogorov complexity to solve the Halting problem”

  1. I really appreciate you writing this article.

    I am fine with every part of the article except for the passage

    ” If $T=\max\{t_i\}$ then we must have that $T>T_{max}(n)$ otherwise $Z$ above is able to simulate all the programs of …… ”

    I am confused here!

    Even if $T > T_{max}(n)$ (or $T < T_{max}(n)$), $Z$ is able to simulate all the programs of size $< 2n$ that generate a string of length $2n$ until they halt…

    Am I correct with this view?

    Can you please elaborate on this paragraph?

    Thanks & Regards.

    • No: by definition, $Z$ is able to simulate all programs of size $n$ to completion (because it simulates them up to the number of steps taken by $BB_n$ to halt). It is able to simulate to completion all programs of size $< 2n$ *that generate a string of length $2n$* (this is important) only if $T \leq T_{max}(n)$, in other words only if $BB_n$ runs for at least as many steps as every program of size $< 2n$ that generates a string of length $2n$; but if this were true we would have a contradiction, because we would be able to generate a string $y$ of length $2n$ that has Kolmogorov complexity at least $2n$ ($C(y) \geq 2n$) using a program ($Z$) whose size is strictly less than $2n$. Write me an email if you have more doubts.

  2. Really helpful post. If your time permits, would you do a post about Gödel's incompleteness theorem from the perspective of Turing machines? I would love to read that.
