Introduction to Computation Theory – Part One

(This article represents a brief divergence from our usual focus on distributed systems and algorithms. Normal service will be resumed shortly.)

One of the courses I do some teaching for is the dryly named ‘Theory of Computation’. Although such a course name is unlikely to pique a lot of interest, in actual fact it is probably the most fundamental and important course that you could take about computer science. Computation theory allows us to ask, and answer, questions about the limits of computers and the programs they execute. Is every problem solvable by a computer program? What do we even mean by a ‘problem’?

It turns out that there are profound and beautiful truths to be discovered when considering these questions. In a short series of articles, I’m going to give a brief introduction to the theory of computability, aimed at those who have never studied Turing machines, Cantor diagonalisation or Godel incompleteness. An appreciation for some of these ideas and results should give a perspective on what computers are, and are not, capable of.

This first article presents a bit of background material on the nature of infinity, which is right at the heart of computation theory. In the next article I’ll tie this in with the idea of computability and show how these ideas elegantly allow us to establish limits on what we can do with computers – even in theory, let alone practice.

Different Types of Infinity

A very important and fundamental question to computation theory is this: “how many unique computer programs are there?”

Although we’re not in a position to be precise about the details of such a question, we can at least give some thought to the kind of answers it might have. Are there just a finite number of computations – different executions with different results that a theoretical computer could perform?

It’s clear that this is unlikely. For example, we can conceive of a program that prints out a unique integer, and we can think of infinitely many programs that each print out a different integer. So it seems that there should be infinitely many programs, or computations. But let’s not be too hasty throwing that word ‘infinite’ around. What do we understand by infinity?

Let’s get one thing straight right away. We can’t say that ‘infinity’ is a quantity, or a number. We can’t add one to infinity. Infinity minus infinity is meaningless, not zero. Rather, infinity is a predicate on sets of things. If I describe to you a set, if we say that it does not contain a finite number of things then we say that the set has the property of being infinite. An infinite set is one that doesn’t have a number that we can call its size.

Bear with me a moment while I get pedantic. What does ‘size’ mean? What’s the size of a set? Obviously, the size of a set is the number of items in it. But how do we arrive at that number? We count the items, one by one. So what we do is give every item in the set a label in order ‘1’, ‘2’, ‘3’… and so on until all the elements in the set have a label. The value of the last label that we assigned is what we call the size, or cardinality of a set. It’s the number of labels we need to uniquely label every element in the set. If a set is infinite, there is no finite number of labels that we can use to uniquely label every element. No matter what number of labels we use, there will always be elements (in fact, infinitely many of them) left over at the end that didn’t get a label. This is what infinite means. If we take a finite number of things out of an infinite set the set will never be empty, no matter how many we take out.

An example of an infinite set is the set of all words that are some number of As followed by the same number of Bs. For example, AB, AABB, AAABBB are all in the set. There are infinitely many words like this. How do we prove that a set is infinite? One way to do it is to take a set that we already know is infinite and show that for every element in our known set there is exactly one element in our candidate infinite set. This is similar to our labelling technique for finite sets – we take an infinite set and show that we can use it to label our candidate set. Since there are infinitely many labels from our infinite set, and we have shown that we need all of them to label our candidate set, we can conclude that our candidate set must itself be infinite, otherwise we would only need a finite number of labels to label it. By coming up with a labelling that uses every element of the infinite set, no matter what it is, we show that no finite labelling can exist.

So how do we apply this technique to our A…B… set? Well, we take a known infinite set, in our case the natural numbers N={1,2,3,…} and find a labelling. Here the labelling is easy to find. Let’s write every element of A as A^nB^n where A^n means ‘repeat A n times’. Then there is exactly one word in A for every positive integer n (negative values of n don’t make a lot of sense). This makes it very easy to come up with a labelling from N to A. A word A^nB^n in A has a label n in N. For every natural number, there is exactly one word. Therefore A must be infinite!

This technique is the most useful when it comes to reasoning about the cardinality of various sets, and is hugely important when talking about computability. Computer scientists will call the act of finding a labelling establishing a bijection between A and N. A bijection is simply a precise word for a relationship that pairs elements of two sets together such that every element of each set has a unique partner in the other.

Now, we have taken as given that the set of integers is infinite, and no-one would argue with us. Is that the end of the story? Do all infinite sets have a bijection with the integers?

It turns out that the answer is no, not all infinite sets can be placed in correspondence with the integers. This result is extremely famous, and due to Georg Cantor, a man who spent much of his mathematical career looking into infinity and drove himself mad as a consequence.

‘Smaller’, But Still The Same Size

Let’s consider two different sets and challenge our intuitions about their size as compared to the integers. Firstly, consider the even integers. Is the set of even integers infinite? We would think so, because no matter what finite collection of them we can put together, we can always think of one that we’ve left out. This is a giveaway property of an infinite set. But at the same time, it seems like there should be fewer even integers than all the integers combined – intuitively we would expect there to be twice as many odd and even integers as just the even ones. But ‘twice as many’ is a murky concept when dealing with infinity, which as we discussed earlier doesn’t yield to normal arithmetic.

Indeed, an attempt to find a bijection between even integers and the whole set of integers turns up a correspondence pretty quickly. Recall that we can write every even integer in the form 2x for all x\in \mathbb{N}. This trivially gives us our bijection – label every even integer with the integer you get by dividing it by two. Conversely, every integer has a corresponding even integer that is the result of multiplying the integer by two. So 1 maps to 2, 7 maps to 14 and 316 maps to 632.

So it turns out that, completely counter-intuitively, there are as many integers as there are both even and odd integers! Infinity does not always behave as we expect. In particular it’s not sufficient to show that one set is a proper subset of another in order to demonstrate their different sizes when both sets are infinite.

Bigger Than Infinite?

I said earlier that there are infinite sets that are not of the same size as the integers. We’ve tried to find a set that was smaller, but had no luck. Let’s consider an apparently larger set and see where we get.

The real numbers are a good candidate. The real numbers are fairly complex to formally define, but we know that they include all the integers, as well as all the numbers with decimal expansions between the integers. Indeed, some real numbers have expansions that are infinitely long – a fact that will prove very important when we talk about the limits of Turing machines.

Is there a bijection between real numbers and integers? This is a tricky question, and required a genuine bit of ingenuity to resolve. The problem is that when a bijection is not obviously forthcoming (it’s hard to write each real number in terms of a single integer) we have to look in the other direction and prove that there is no such bijection. This is a good deal harder – a positive proof just requires us to come up with a single exemplar, whereas a negative proof has to be valid for every single possible bijection!

Cantor’s attack on the negative proof went as follows. Imagine there is such a bijection. Then we could conceive of a list with all the real numbers in order, as labelled by their integer counterparts. So at the top would be real number 1, followed by real number 2 and so on. Obviously, the list would be infinite so we couldn’t write it down. That’s no problem, we just have to imagine it.

Note at this point how the list doesn’t depend at all on the nature of the bijection itself – no matter what the bijection is, we can construct this list. This generality is crucial for easily dismissing the entire set of bijections at once.

The next step is the neat one. If this list was a bijection, it would contain every single real number. So if we could come up with some real number that wasn’t in the list, we would show that it was not a bijection. Cantor’s great insight was to see that there is a way to construct such a real number by showing that it could not be equal to any real number in the list.

Cantor imagined the list of real numbers as a table, with the rows being real numbers, and the columns being the individual digits of each real number. So the 3rd column would contain the 3rd digits of every real number. Therefore we can identify the j^{th} digit from the i^{th} real number by talking about the digit d_{ij}.

To keep the presentation straightforward, we’re going to consider only the real numbers between 0 and 1 – if there are more of even these kinds of number than real numbers, we’ll still have established the result.

Digit 1 Digit 2 Digit 3 Digit 4
r_1 0. 7 3 9 4
r_2 0. 9 3 2 8
r_3 0. 4 2 9 8

This table is the first few rows of an example bijection.

Now the key observation here is that, for a real number r_p to be different from r_q it is enough for r_p to differ from r_q at just one digit. That is, as long as there exists a k for which d_{pk} \neq d_{qk} then r_p \neq r_q.

So to find a real number that is not in the list, we just have to find a way to construct one that is different from every real number in the list at at least one digit.

Obviously it’s no good just changing a single digit – say we set the 3rd digit of our real number to 7 – because it’s guaranteed that there are some real numbers whose 3rd digit is 7 and we don’t know whether or not the rest of our real number matches them.

However, just as there are an infinite number of real numbers in our list, there are infinitely many columns in our table – so infinitely many digits to play with. So if we just make sure that each digit is different from its counterpart in one real number each, that will be sufficient because there are enough digits to go around. We will construct a real number C=c_1c_2c_3\hdots so that each digit c_k of C is different from digit d_{ik} of the i^{th} real number in the list, and that i is different for all c_k.

There are two challenges here. The first is to come up with a method that lets us always pick a digit that is different. The second is to come up with a mapping – in fact, another bijection – between the columns and the rows so that every digit in our constructed real number is responsible for being different from exactly one real number in the table.

Both challenges are easily resolved. If the digit we wish to be different from is x, then it’s simple to construct y \neq x by adding one to x (and if x is 9, wrapping y around to 0).

We can also neatly deal with the second challenge. In our constructed real number C, we’ll give digit c_1 the responsibility of being different from the 1st real number in the list, at digit d_{11}. Similarly for c_2 \neq d_{22}. And so on. Every digit of C causes it to be different from a unique real number in the list. By combining all these digits together, we get a new, infinitely long real number that is different from every single listed real number.

Digit 1 Digit 2 Digit 3 Digit 4
r_1 0. 7 3 9 4
r_2 0. 9 3 2 8
r_3 0. 4 2 9 8

In the case of our example bijection, we would have the beginnings of C as 0.840… – different from 0.7394… at digit 1, different from 0.9328… at digit 2 and different from 0.4298… at digit 3. Different from every real number in the table!

This concludes the proof. What we have shown is that there is no bijection between the real numbers and the integers. No matter what the bijection, we can come up with some real number that’s not been included. Therefore we can conclude that there are more real numbers than integers.

The shape that the ‘differing digits’ make in the table of real numbers moves along the diagonal. This is why this is called ‘Cantor’s diagonal proof’. In actual fact, the diagonal pattern is just one of infinitely many bijections we could have come up with. It is the simplest, and the easiest to understand.

Two Kinds of Infinity

So this leads us to the strange idea that there are two different kinds of infinity. The first kind of infinite sets, like the integers, are called countably infinite to emphasise the idea that they can be placed in one-to-one correspondence with the integers, an action that we cal ‘counting’. The second kind of infinite sets that we have discovered, like the real numbers, are called uncountably infinite by way of contrast. Uncountably infinite sets are unimaginably more vast than countable ones. Between any two real numbers, there are uncountably infinite intermediate real numbers – more than there are integers. There are more real numbers between 0 and 1 than there are integers.

This contrast will play a significant role in the development of a theory of computation. We will be able to identify the number of possible computations with a countably infinite set, and therefore describe, using our established language of infinity, computations that our best efforts cannot perform, not even with all the resources in the world.

4 thoughts on “Introduction to Computation Theory – Part One

  1. Enjoyed very much what you wrote! Will be eagerly looking forward to the rest in the series.

  2. Well, even I did not said it when you published this article, I too would like to see a part two. It is hard to wait.

Comments are closed.