theory of computation

Theory of computation
Automata theory
Regular language
Deterministic finite automaton
Context-free language
Pushdown automaton
Recursively enumerable set
Turing machine
Undecidable problem

Theory of computation

In theoretical computer science and mathematics, the theory of computation is the branch that deals with whether and how efficiently problems can be solved on a model of computation, using an algorithm. The field is divided into three major branches: automata theory, computability theory and computational complexity theory.[1]

In order to perform a rigorous study of computation, computer scientists work with a mathematical abstraction of computers called a model of computation. There are several models in use, but the most commonly examined is the Turing machine. Computer scientists study the Turing machine because it is simple to formulate, can be analyzed and used to prove results, and because it represents what many consider the most powerful possible "reasonable" model of computation (see Church–Turing thesis). It might seem that the potentially infinite memory capacity is an unrealizable attribute, but any halting computation of a Turing machine uses only a finite amount of memory. So in principle, any given instance of a problem that can be solved (decided) by a Turing machine can be solved by a computer that has a finite amount of memory, although the amount needed may grow with the size of the input.

Relationship between computability theory, complexity theory and formal language theory.


The theory of computation can be considered the creation of models of all kinds in the field of computer science. It therefore relies on mathematics and logic. In the last century it became an independent academic discipline and was separated from mathematics.

Some pioneers of the theory of computation were Alonzo Church, Alan Turing, Stephen Kleene, John von Neumann and Claude Shannon.


Automata theory

Automata theory is the study of abstract machines (or more appropriately, abstract 'mathematical' machines or systems) and the computational problems that can be solved using these machines. These abstract machines are called automata. Automata comes from the Greek word Αυτόματα, which means that something is doing something by itself. Automata theory is also closely related to formal language theory, as the automata are often classified by the class of formal languages they are able to recognize. An automaton can be a finite representation of a formal language that may be an infinite set.


Computability theory

Computability theory deals primarily with the question of the extent to which a problem is solvable on a computer. The statement that the halting problem cannot be solved by a Turing machine is one of the most important results in computability theory, as it is an example of a concrete problem that is both easy to formulate and impossible to solve using a Turing machine. Much of computability theory builds on the halting problem result.

Another important step in computability theory was Rice's theorem, which states that for all non-trivial properties of partial functions, it is undecidable whether a Turing machine computes a partial function with that property.

Computability theory is closely related to the branch of mathematical logic called recursion theory, which removes the restriction of studying only models of computation which are reducible to the Turing model. Many mathematicians and computational theorists who study recursion theory will refer to it as computability theory.

Computational complexity theory

Complexity theory considers not only whether a problem can be solved at all on a computer, but also how efficiently the problem can be solved. Two major aspects are considered: time complexity and space complexity, which are respectively how many steps it takes to perform a computation, and how much memory is required to perform that computation.

In order to analyze how much time and space a given algorithm requires, computer scientists express the time or space required to solve the problem as a function of the size of the input problem. For example, finding a particular number in a long list of numbers becomes harder as the list of numbers grows larger. If we say there are n numbers in the list, then if the list is not sorted or indexed in any way we may have to look at every number in order to find the number we're seeking. We thus say that in order to solve this problem, the computer needs to perform a number of steps that grows linearly in the size of the problem.

To simplify this problem, computer scientists have adopted Big O notation, which allows functions to be compared in a way that ensures that particular aspects of a machine's construction do not need to be considered, but rather only the asymptotic behavior as problems become large. So in our previous example we might say that the problem requires O(n) steps to solve.

Perhaps the most important open problem in all of computer science is the question of whether a certain broad class of problems denoted NP can be solved efficiently. This is discussed further at Complexity classes P and NP.
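The list-search example can be made concrete with a small sketch. The function name linear_search and the step counter are illustrative assumptions, not part of any standard library; the counter simply records how many elements are inspected, which in the worst case (target absent) equals the list length n.

```python
def linear_search(items, target):
    """Return (index, steps); index is -1 when target is absent."""
    steps = 0
    for index, value in enumerate(items):
        steps += 1  # one comparison per element inspected
        if value == target:
            return index, steps
    return -1, steps


if __name__ == "__main__":
    for n in (10, 100, 1000):
        data = list(range(n))
        # Worst case: the target is absent, so all n elements are inspected.
        _, steps = linear_search(data, -1)
        print(n, steps)
```

In this sketch the worst-case step count grows in direct proportion to n, which is the behavior that the linear-growth statement above describes.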

Models of computation

Aside from a Turing machine, other equivalent (see Church–Turing thesis) models of computation are in use.

Lambda calculus

A computation consists of an initial lambda expression (or two if you want to separate the function and its input) plus a finite sequence of lambda terms, each deduced from the preceding term by one application of beta reduction.

Combinatory logic

A concept which has many similarities to λ-calculus, although important differences also exist (e.g. the fixed point combinator Y has a normal form in combinatory logic but not in λ-calculus). Combinatory logic was developed with great ambitions: understanding the nature of paradoxes, making the foundations of mathematics more economic (conceptually), and eliminating the notion of variables (thus clarifying their role in mathematics).

mu-recursive functions

A computation consists of a mu-recursive function, i.e. its defining sequence, any input value(s) and a sequence of recursive functions appearing in the defining sequence with inputs and outputs. Thus, if in the defining sequence of a recursive function f(x) the functions g(x) and h(x,y) appear, then terms of the form 'g(5)=7' or 'h(3,2)=10' might appear. Each entry in this sequence needs to be an application of a basic function or follow from the entries above by using composition, primitive recursion or mu recursion. For instance if f(x) = h(x,g(x)), then for 'f(5)=3' to appear, terms like 'g(5)=6' and 'h(5,6)=3' must occur above. The computation terminates only if the final term gives the value of the recursive function applied to the inputs.

Markov algorithm

A string rewriting system that uses grammar-like rules to operate on strings of symbols.

Register machine

A theoretically interesting idealization of a computer. There are several variants. In most of them, each register can hold a natural number (of unlimited size), and the instructions are simple (and few in number), e.g. only decrementation (combined with conditional jump), incrementation and halting exist. The lack of the infinite (or dynamically growing) external store (as seen in Turing machines) can be understood by replacing its role with Gödel numbering techniques: the fact that each register holds a natural number allows the possibility of representing a complicated thing (e.g. a sequence, or a matrix etc.) by an appropriately huge natural number; unambiguity of both representation and interpretation can be established by the number-theoretical foundations of these techniques.
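A minimal sketch of one such variant follows. The instruction names INC, DECJZ and HALT, the tuple encoding of a program, and the max_steps guard are assumptions made for illustration; the literature uses several different but equivalent instruction sets. The sample program adds register 1 into register 0 using only increment and decrement-with-conditional-jump.

```python
def run_register_machine(program, registers, max_steps=10_000):
    """Run a program on a list of natural-number registers and return them.

    Instructions (hypothetical encoding):
      ("INC", r, nxt)            add 1 to register r, continue at nxt
      ("DECJZ", r, nxt, if_zero) if register r is 0 jump to if_zero,
                                 otherwise subtract 1 and continue at nxt
      ("HALT",)                  stop
    """
    regs = list(registers)
    pc = 0  # index of the current instruction
    for _ in range(max_steps):
        op = program[pc]
        if op[0] == "HALT":
            return regs
        if op[0] == "INC":
            _, r, nxt = op
            regs[r] += 1
            pc = nxt
        elif op[0] == "DECJZ":
            _, r, nxt, if_zero = op
            if regs[r] == 0:
                pc = if_zero
            else:
                regs[r] -= 1
                pc = nxt
        else:
            raise ValueError(f"unknown instruction: {op!r}")
    raise RuntimeError("step limit exceeded")


# Adds register 1 into register 0, leaving register 1 at zero.
ADD = [
    ("DECJZ", 1, 1, 2),  # 0: if r1 == 0 go to 2, else r1 -= 1 and go to 1
    ("INC", 0, 0),       # 1: r0 += 1, go back to 0
    ("HALT",),           # 2: stop
]

if __name__ == "__main__":
    print(run_register_machine(ADD, [3, 4]))  # [7, 0]
```

The step limit is only a practical safeguard for the sketch; the idealized machine has no such bound.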

P′′

Like Turing machines, P′′ uses an infinite tape of symbols (without random access), and a rather minimalistic set of instructions. But these instructions are very different; thus, unlike Turing machines, P′′ does not need to maintain a distinct state, because all "memory-like" functionality can be provided by the tape alone. Instead of rewriting the current symbol, it can perform a modular arithmetic incrementation on it. P′′ also has a pair of instructions for a cycle, inspecting the blank symbol. Despite its minimalistic nature, it has become the parental formal language of an implemented and (for entertainment) used programming language called Brainfuck.

In addition to the general computational models, some simpler computational models are useful for special, restricted applications. Regular expressions, for example, specify string patterns in many contexts, from office productivity software to programming languages. Finite automata, another formalism mathematically equivalent to regular expressions, are used in circuit design and in some kinds of problem-solving. Context-free grammars specify programming language syntax. Non-deterministic pushdown automata are another formalism equivalent to context-free grammars. Primitive recursive functions are a defined subclass of the recursive functions.

Different models of computation have the ability to do different tasks. One way to measure the power of a computational model is to study the class of formal languages that the model can generate; in this way the Chomsky hierarchy of languages is obtained.


Automata theory

An example of an automaton. The study of the mathematical properties of such automata is automata theory.

In theoretical computer science, automata theory is the study of mathematical objects called abstract machines or automata and the computational problems that can be solved using them. Automata comes from the Greek word αὐτόματα meaning "self-acting".

The figure at right illustrates a finite state machine, which belongs to one well-known variety of automaton. This automaton consists of states (represented in the figure by circles), and transitions (represented by arrows). As the automaton sees a symbol of input, it makes a transition (or jump) to another state, according to its transition function (which takes the current state and the most recent symbol as its inputs).

Automata theory is also closely related to formal language theory. An automaton is a finite representation of a formal language that may be an infinite set. Automata are often classified by the class of formal languages they are able to recognize.

Automata play a major role in theory of computation, compiler design, parsing and formal verification.

Automata

Following is an introductory definition of one type of automaton, which attempts to help one grasp the essential concepts involved in automata theory.

Informal description

An automaton is supposed to run on some given sequence of inputs in discrete time steps. At each time step, an automaton gets one input that is picked from a set of symbols or letters, which is called an alphabet. At any time, the symbols so far fed to the automaton as input form a finite sequence of symbols, which is called a word. An automaton contains a finite set of states. At each instant during a run, the automaton is in one of its states. At each time step when the automaton reads a symbol, it jumps or transits to a next state that is decided by a function that takes the current state and the symbol currently read as parameters. This function is called the transition function. The automaton reads the symbols of the input word one after another and transits from state to state according to the transition function, until the word is read completely. Once the input word has been read, the automaton is said to have stopped, and the state at which the automaton has stopped is called the final state. Depending on the final state, the automaton is said to either accept or reject an input word. There is a subset of states of the automaton, which is defined as the set of accepting states. If the final state is an accepting state, then the automaton accepts the word. Otherwise, the word is rejected. The set of all the words accepted by an automaton is called the language recognized by the automaton.

In short, an automaton is a mathematical object that takes a word as input and decides either to accept it or reject it. Since all computational problems are reducible to the accept/reject question on words (all problem instances can be represented by a finite sequence of symbols), automata theory plays a crucial role in computational theory.


Formal definition

Automaton

An automaton is represented formally by a 5-tuple (Q,Σ,δ,q0,F), where:

• Q is a finite set of states.
• Σ is a finite set of symbols, called the alphabet of the automaton.
• δ is the transition function, that is, δ: Q × Σ → Q.
• q0 is the start state, that is, the state of the automaton before any input has been processed, where q0 ∈ Q.
• F is a set of states of Q (i.e. F ⊆ Q) called accept states.

Input word

An automaton reads a finite string of symbols a1,a2,...,an, where ai ∈ Σ, which is called an input word. The set of all words is denoted by Σ*.

Run

A sequence of states q0,q1,q2,...,qn, where qi ∈ Q such that q0 is the start state and qi = δ(qi-1,ai) for 0 < i ≤ n, is a run of the automaton on an input word w = a1,a2,...,an ∈ Σ*. In other words, at first the automaton is at the start state q0, and then the automaton reads symbols of the input word in sequence. When the automaton reads symbol ai it jumps to state qi = δ(qi-1,ai). qn is said to be the final state of the run.

Accepting word

A word w ∈ Σ* is accepted by the automaton if qn ∈ F.

Recognized language

An automaton can recognize a formal language. The language L ⊆ Σ* recognized by an automaton is the set of all the words that are accepted by the automaton.

Recognizable languages

The recognizable languages are the set of languages that are recognized by some automaton. For the above definition of automata the recognizable languages are regular languages. For different definitions of automata, the recognizable languages are different.
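The definitions of run and acceptance translate almost directly into code. The following is a minimal sketch in which the transition function δ is assumed to be stored as a dictionary keyed by (state, symbol) pairs; the sample automaton, which accepts words over {a, b} ending in b, and all identifiers are hypothetical and chosen only for illustration.

```python
def run(delta, q0, word):
    """Return the run q0, q1, ..., qn of the automaton on the input word."""
    states = [q0]
    for symbol in word:
        # qi = delta(q(i-1), ai)
        states.append(delta[(states[-1], symbol)])
    return states


def accepts(delta, q0, accepting, word):
    """A word is accepted when the final state of its run is in F."""
    return run(delta, q0, word)[-1] in accepting


# Hypothetical automaton: Q = {"p", "q"}, alphabet {a, b}, start state "p",
# F = {"q"}. It is in state "q" exactly when the last symbol read was "b".
DELTA = {
    ("p", "a"): "p",
    ("p", "b"): "q",
    ("q", "a"): "p",
    ("q", "b"): "q",
}

if __name__ == "__main__":
    print(run(DELTA, "p", "abb"))             # ['p', 'p', 'q', 'q']
    print(accepts(DELTA, "p", {"q"}, "abb"))  # True
    print(accepts(DELTA, "p", {"q"}, "ba"))   # False
```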

Variant definitions of automata

Automata are defined to study useful machines under mathematical formalism. So, the definition of an automaton is open to variations according to the "real world machine" which we want to model using the automaton. People have studied many variations of automata. The most standard variant, which is described above, is called a deterministic finite automaton. The following are some popular variations in the definition of different components of automata.

Input

• Finite input: An automaton that accepts only finite sequences of symbols. The above introductory definition only encompasses finite words.
• Infinite input: An automaton that accepts infinite words (ω-words). Such automata are called ω-automata.
• Tree word input: The input may be a tree of symbols instead of a sequence of symbols. In this case after reading each symbol, the automaton reads all the successor symbols in the input tree. It is said that the automaton makes one copy of itself for each successor and each such copy starts running on one of the successor symbols from the state according to the transition relation of the automaton. Such an automaton is called a tree automaton.
• Infinite tree input: The two extensions above can be combined, so the automaton reads a tree structure with (in)finite branches. Such an automaton is called an infinite tree automaton.



States

• Finite states: An automaton that contains only a finite number of states. The above introductory definition describes automata with finite numbers of states.
• Infinite states: An automaton that may not have a finite number of states, or even a countable number of states. For example, the quantum finite automaton or topological automaton has an uncountable infinity of states.
• Stack memory: An automaton may also contain some extra memory in the form of a stack in which symbols can be pushed and popped. This kind of automaton is called a pushdown automaton.

Transition function

• Deterministic: For a given current state and an input symbol, if an automaton can only jump to one and only one state then it is a deterministic automaton.
• Nondeterministic: An automaton that, after reading an input symbol, may jump into any of a number of states, as licensed by its transition relation. Notice that the term transition function is replaced by transition relation: the automaton non-deterministically decides to jump into one of the allowed choices. Such automata are called nondeterministic automata.
• Alternation: This idea is quite similar to the tree automaton, but orthogonal. The automaton may run multiple copies of itself on the same next read symbol. Such automata are called alternating automata. The acceptance condition must be satisfied by all runs of such copies for the input to be accepted.

Acceptance condition

• Acceptance of finite words: Same as described in the informal definition above.
• Acceptance of infinite words: An omega automaton cannot have final states, as infinite words never terminate. Rather, acceptance of the word is decided by looking at the infinite sequence of visited states during the run.
• Probabilistic acceptance: An automaton need not strictly accept or reject an input. It may accept the input with some probability between zero and one. For example, the quantum finite automaton, geometric automaton and metric automaton have probabilistic acceptance.

Different combinations of the above variations produce many classes of automaton.

Automata theory

Automata theory is a subject matter that studies properties of various types of automata. For example, the following questions are studied about a given type of automata.

• Which class of formal languages is recognizable by some type of automata? (Recognizable languages)
• Are certain automata closed under union, intersection, or complementation of formal languages? (Closure properties)
• How expressive is a type of automaton in terms of the class of formal languages it recognizes, and what is the relative expressive power of different types? (Language hierarchy)

Automata theory also studies whether there exists an effective algorithm to solve problems similar to those in the following list.

• Does an automaton accept any input word? (Emptiness checking)
• Is it possible to transform a given non-deterministic automaton into a deterministic automaton without changing the recognizable language? (Determinization)
• For a given formal language, what is the smallest automaton that recognizes it? (Minimization)
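For finite automata, the emptiness question has a simple effective answer: the recognized language is non-empty exactly when some accept state is reachable from the start state. The following sketch, assuming a deterministic automaton stored as a dictionary from (state, symbol) pairs to states, uses breadth-first search and returns a shortest accepted word as a witness. The function name and the sample automaton are hypothetical.

```python
from collections import deque


def find_accepted_word(alphabet, delta, q0, accepting):
    """Return a shortest accepted word (list of symbols), or None if the
    recognized language is empty."""
    queue = deque([(q0, [])])
    seen = {q0}
    while queue:
        state, word = queue.popleft()
        if state in accepting:
            return word
        for symbol in alphabet:
            nxt = delta.get((state, symbol))
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + [symbol]))
    return None  # no accept state is reachable


# Hypothetical automaton over {a, b} that accepts words containing "ab".
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 2, (2, "b"): 2,
}

if __name__ == "__main__":
    print(find_accepted_word("ab", DELTA, 0, {2}))    # ['a', 'b']
    print(find_accepted_word("ab", DELTA, 0, set()))  # None
```

Because each state is visited at most once, the search takes time proportional to the number of transitions.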


Classes of automata

The following is an incomplete list of types of automata.

Automaton                                                               Recognizable language
Nondeterministic/deterministic finite state machine (FSM)               regular languages
Deterministic pushdown automaton (DPDA)                                 deterministic context-free languages
Pushdown automaton (PDA)                                                context-free languages
Linear bounded automaton (LBA)                                          context-sensitive languages
Turing machine                                                          recursively enumerable languages
Deterministic Büchi automaton                                           ω-limit languages
Nondeterministic Büchi automaton                                        ω-regular languages
Rabin automaton, Streett automaton, Parity automaton, Muller automaton  ω-regular languages

Discrete, continuous, and hybrid automata

Normally automata theory describes the states of abstract machines but there are analog automata or continuous automata or hybrid discrete-continuous automata, which use analog data, continuous time, or both.

Applications

Each model in automata theory plays important roles in several applied areas. Finite automata are used in text processing, compilers, and hardware design. Context-free grammars (CFGs) are used in programming languages and artificial intelligence. Originally, CFGs were used in the study of human languages. Cellular automata are used in the field of biology, the most common example being John Conway's Game of Life. Some other examples which could be explained using automata theory in biology include mollusk and pine cone growth and pigmentation patterns. Going further, a theory suggesting that the whole universe is computed by some sort of discrete automaton is advocated by some scientists. The idea originated in the work of Konrad Zuse, and was popularized in America by Edward Fredkin.

Automata Simulators

Automata simulators are pedagogical tools used to teach, learn and research automata theory. An automata simulator takes as input the description of an automaton and then simulates its working for an arbitrary input string. The description of the automaton can be entered in several ways. An automaton can be defined in a symbolic language, or its specification may be entered in a predesigned form, or its transition diagram may be drawn by clicking and dragging the mouse. Well known automata simulators include Turing's World, JFLAP, VAS, TAGS and SimStudio.[1]

Connection to Category theory

One can define several distinct categories of automata[2] following the automata classification into different types described in the previous section. The mathematical category of deterministic automata, sequential machines or sequential automata, and Turing machines with automata homomorphisms defining the arrows between automata is a Cartesian closed category;[3][4] it has both categorical limits and colimits. An automata homomorphism maps a quintuple of an automaton Ai onto the quintuple of another automaton Aj.[5] Automata homomorphisms can also be considered as automata transformations or as semigroup homomorphisms, when the state space, S, of the automaton is defined as a semigroup Sg. Monoids are also considered as a suitable setting for automata in monoidal categories.



Categories of variable automata

One could also define a variable automaton, in the sense of Norbert Wiener in his book "The Human Use of Human Beings", via the endomorphisms Ai → Ai. Then, one can show that such variable automata homomorphisms form a mathematical group. In the case of non-deterministic, or other complex kinds of automata, the latter set of endomorphisms may become, however, a variable automaton groupoid. Therefore, in the most general case, categories of variable automata of any kind are categories of groupoids[9] or groupoid categories. Moreover, the category of reversible automata is then a 2-category, and also a subcategory of the 2-category of groupoids, or the groupoid category.

Regular language

In theoretical computer science and formal language theory, a regular language is a formal language that can be expressed using a regular expression. Note that the "regular expression" features provided with many programming languages are augmented with features that make them capable of recognizing languages that cannot be expressed by the formal regular expressions (as formally defined below).

In the Chomsky hierarchy, regular languages are defined to be the languages that are generated by Type-3 grammars (regular grammars). Regular languages are very useful in input parsing and programming language design.

Formal definition

The collection of regular languages over an alphabet Σ is defined recursively as follows:

• The empty language Ø is a regular language.
• For each a ∈ Σ (a belongs to Σ), the singleton language {a} is a regular language.
• If A and B are regular languages, then A ∪ B (union), A • B (concatenation), and A* (Kleene star) are regular languages.
• No other languages over Σ are regular.

See regular expression for its syntax and semantics. Note that the above cases are in effect the defining rules of regular expressions.

Examples

All finite languages are regular; in particular the empty string language {ε} = Ø* is regular. Other typical examples include the language consisting of all strings over the alphabet {a, b} which contain an even number of a's, or the language consisting of all strings of the form: several a's followed by several b's.

A simple example of a language that is not regular is the set of strings { aⁿbⁿ | n ≥ 0 }.[1] Intuitively, it cannot be recognized with a finite automaton, since a finite automaton has finite memory and it cannot remember the exact number of a's. Techniques to prove this fact rigorously are given below.
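The contrast between these examples can be sketched in code. The pattern below for "an even number of a's" uses only union-free concatenation and Kleene star, so it stays within the formal regular expressions; it is one possible pattern, written here for Python's re module as an assumption of this sketch. The check for aⁿbⁿ, by comparison, is written with explicit counting, which is exactly what a fixed finite set of states cannot do for unbounded n. Both function names are hypothetical.

```python
import re

# Each repetition of the group consumes exactly two a's, with any number
# of b's around them; trailing b's are allowed at the end.
EVEN_AS = re.compile(r"(b*ab*a)*b*")


def has_even_as(word):
    """True when word over {a, b} contains an even number of a's."""
    return EVEN_AS.fullmatch(word) is not None


def is_anbn(word):
    """True when word is n a's followed by n b's; needs a counter."""
    n = len(word) // 2
    return word == "a" * n + "b" * n


if __name__ == "__main__":
    for w in ("", "abab", "bab", "aabb", "aab"):
        print(repr(w), has_even_as(w), is_anbn(w))
```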

Equivalence to other formalisms

A regular language satisfies the following equivalent properties:

• it can be accepted by a (nondeterministic) finite automaton, but also by the restricted deterministic finite automaton, or the more general alternating finite automaton
• it can be generated by a regular grammar
• it can be generated by a prefix grammar
• it can be accepted by a read-only Turing machine
• it can be defined in monadic second-order logic
• it is recognized by some finite monoid, meaning it is the preimage of a subset of a finite monoid under a homomorphism from the free monoid on its alphabet

The above properties are sometimes used as alternative definitions of regular languages.


Closure properties

The regular languages are closed under various operations; that is, if the languages K and L are regular, so is the result of the following operations:

• the set-theoretic Boolean operations: union K ∪ L, intersection K ∩ L, and complement of L. From this the difference K − L also follows.
• the regular operations: union K ∪ L, concatenation K ∘ L, and Kleene star L*.
• the trio operations: string homomorphism, inverse string homomorphism, and intersection with regular languages. As a consequence they are closed under arbitrary finite state transductions, like quotient with a regular language. Even more, regular languages are closed under quotients with arbitrary languages: if L is regular then L/K is regular for any K.
• the reverse (or mirror image) L^R.
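Closure under intersection can be illustrated with the product construction, a standard technique that is not spelled out in the text above: run two deterministic automata side by side and accept when both accept. In this sketch each automaton is assumed to be a tuple (states, delta, start, accepting) with delta a dictionary on (state, symbol) pairs; the two sample automata and all identifiers are hypothetical.

```python
from itertools import product


def intersect(dfa1, dfa2, alphabet):
    """Build a DFA whose states are pairs and which accepts K ∩ L."""
    states1, delta1, start1, acc1 = dfa1
    states2, delta2, start2, acc2 = dfa2
    states = set(product(states1, states2))
    delta = {
        ((p, q), a): (delta1[(p, a)], delta2[(q, a)])
        for (p, q) in states
        for a in alphabet
    }
    accepting = {(p, q) for (p, q) in states if p in acc1 and q in acc2}
    return states, delta, (start1, start2), accepting


def accepts(dfa, word):
    _, delta, state, accepting = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


# K: even number of a's.  L: words ending in b.
EVEN_A = ({0, 1},
          {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
          0, {0})
ENDS_B = ({"n", "y"},
          {("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"},
          "n", {"y"})

BOTH = intersect(EVEN_A, ENDS_B, "ab")

if __name__ == "__main__":
    for w in ("aab", "ab", "aa", "b"):
        print(w, accepts(BOTH, w))
```

Changing the acceptance test from "and" to "or" in the same construction would give a sketch for union.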

Deciding whether a language is regular

Regular language in classes of Chomsky hierarchy.

To locate the regular languages in the Chomsky hierarchy, one notices that every regular language is context-free. The converse is not true: for example the language consisting of all strings having the same number of a's as b's is context-free but not regular. To prove that a language such as this is not regular, one often uses the Myhill–Nerode theorem or the pumping lemma among other methods.[2]

There are two purely algebraic approaches to defining regular languages. If:

• Σ is a finite alphabet,
• Σ* denotes the free monoid over Σ consisting of all strings over Σ,
• f : Σ* → M is a monoid homomorphism where M is a finite monoid,
• S is a subset of M

then the set { w ∈ Σ* | f(w) ∈ S } is regular. Every regular language arises in this fashion.

If L is any subset of Σ*, one defines an equivalence relation ~ (called the syntactic relation) on Σ* as follows: u ~ v is defined to mean

uw ∈ L if and only if vw ∈ L for all w ∈ Σ*

The language L is regular if and only if the number of equivalence classes of ~ is finite (a proof of this is provided in the article on the syntactic monoid). When a language is regular, then the number of equivalence classes is equal to the number of states of the minimal deterministic finite automaton accepting L.

A similar set of statements can be formulated for a monoid M. In this case, equivalence over M leads to the concept of a recognizable language.


Complexity results

In computational complexity theory, the complexity class of all regular languages is sometimes referred to as REGULAR or REG and equals DSPACE(O(1)), the decision problems that can be solved in constant space (the space used is independent of the input size). REGULAR ≠ AC0, since it (trivially) contains the parity problem of determining whether the number of 1 bits in the input is even or odd and this problem is not in AC0.[3] On the other hand, REGULAR does not contain AC0, because the nonregular language of palindromes, or the nonregular language { 0ⁿ1ⁿ | n ≥ 0 }, can both be recognized in AC0.[4]

If a language is not regular, it requires a machine with at least Ω(log log n) space to recognize (where n is the input size).[5] In other words, DSPACE(o(log log n)) equals the class of regular languages. In practice, most nonregular problems are solved by machines taking at least logarithmic space.

Subclasses

Important subclasses of regular languages include

• Finite languages, those containing only a finite number of words. These are regular languages, as one can create a regular expression that is the union of every word in the language.
• Star-free languages, those that can be described by a regular expression constructed from the empty symbol, letters, concatenation and all boolean operators including complementation but not the Kleene star: this class includes all finite languages.[6]

• Cyclic languages, satisfying the conditions uv ∈ L ⇔ vu ∈ L and w ∈ L ⇔ wⁿ ∈ L.[7]

The number of words in a regular language

Let $s_L(n)$ denote the number of words of length $n$ in $L$. The ordinary generating function for L is the formal power series

$$S_L(z) = \sum_{n \ge 0} s_L(n) z^n.$$

The generating function of a language L is a rational function if L is regular.[7] Hence for any infinite regular language $L$ there exist constants $\lambda_1, \ldots, \lambda_k$ and polynomials $p_1(x), \ldots, p_k(x)$ such that for every sufficiently large $n$ the number $s_L(n)$ of words of length $n$ in $L$ satisfies the equation $s_L(n) = p_1(n)\lambda_1^n + \cdots + p_k(n)\lambda_k^n$.[8][9]

Thus, non-regularity of certain infinite languages $L'$ can be proved by counting the words of a given length in $L'$. Consider, for example, the Dyck language of strings of balanced parentheses. The number of words of length $2n$ in the Dyck language is equal to the Catalan number $C_n \sim \frac{4^n}{n^{3/2}\sqrt{\pi}}$, which is not of the form $p(n)\lambda^n$, witnessing the non-regularity of the Dyck language.

The zeta function of a language L is[7]

$$\zeta_L(z) = \exp\left(\sum_{n \ge 1} s_L(n) \frac{z^n}{n}\right).$$

The zeta function of a regular language is not in general rational, but that of a cyclic language is.[10]
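For a regular language given by a deterministic finite automaton, the number of accepted words of each length can be computed by dynamic programming over the states. The sketch below assumes a DFA stored as a dictionary on (state, symbol) pairs and uses, as a hypothetical example, an automaton for binary strings with an even number of 0s; the Catalan numbers for the Dyck language are computed alongside for comparison. All identifiers are illustrative.

```python
from math import comb


def count_words(alphabet, delta, q0, accepting, n):
    """Number of words of length n accepted by the DFA."""
    counts = {q0: 1}  # counts[q] = number of words read so far that lead to q
    for _ in range(n):
        nxt = {}
        for state, c in counts.items():
            for symbol in alphabet:
                target = delta[(state, symbol)]
                nxt[target] = nxt.get(target, 0) + c
        counts = nxt
    return sum(c for state, c in counts.items() if state in accepting)


def catalan(n):
    """Number of Dyck words of length 2n."""
    return comb(2 * n, n) // (n + 1)


# Hypothetical DFA: binary strings with an even number of 0s.
DELTA = {
    ("S1", "0"): "S2", ("S1", "1"): "S1",
    ("S2", "0"): "S1", ("S2", "1"): "S2",
}

if __name__ == "__main__":
    print([count_words("01", DELTA, "S1", {"S1"}, n) for n in range(6)])
    # [1, 1, 2, 4, 8, 16]
    print([catalan(n) for n in range(6)])
    # [1, 1, 2, 5, 14, 42]
```

For this sample automaton the counts double at each step from length 1 onward, in line with the exponential-polynomial form that holds for regular languages.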


Generalizations

The notion of a regular language has been generalized to infinite words (see ω-automata) and to trees (see tree automaton).


Deterministic finite automaton

An example of a deterministic finite automaton that accepts only binary numbers that are multiples of 3. The state S0 is both the start state and an accept state.

In automata theory, a branch of theoretical computer science, a deterministic finite automaton (DFA), also known as a deterministic finite state machine, is a finite state machine that accepts or rejects finite strings of symbols and produces a unique computation (or run) of the automaton for each input string.[1] 'Deterministic' refers to the uniqueness of the computation. In search of the simplest models to capture finite state machines, McCulloch and Pitts were among the first researchers to introduce a concept similar to the finite automaton in 1943.[2][3]

The figure at right illustrates a deterministic finite automaton using a state diagram. In the automaton, there are three states: S0, S1, and S2 (denoted graphically by circles). The automaton takes a finite sequence of 0s and 1s as input. For each state, there is a transition arrow leading out to a next state for both 0 and 1. Upon reading a symbol, a DFA jumps deterministically from one state to another by following the transition arrow. For example, if the automaton is currently in state S0 and the current input symbol is 1 then it deterministically jumps to state S1. A DFA has a start state (denoted graphically by an arrow coming in from nowhere) where computations begin, and a set of accept states (denoted graphically by a double circle) which help define when a computation is successful.

A DFA is defined as an abstract mathematical concept, but due to the deterministic nature of a DFA, it is implementable in hardware and software for solving various specific problems. For example, a DFA can model software that decides whether or not online user input such as an email address is valid.[4] (See finite state machine for more practical examples.)

DFAs recognize exactly the set of regular languages[5] which are, among other things, useful for doing lexical analysis and pattern matching. DFAs can be built from nondeterministic finite automata through the powerset construction.
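The powerset construction can be sketched as follows. The sketch assumes an NFA without ε-moves whose transition relation is stored as a dictionary from (state, symbol) pairs to sets of states; each DFA state is a frozenset of NFA states, and only subsets reachable from the start are built. The sample NFA, which accepts binary strings whose second-to-last symbol is 1, and all identifiers are hypothetical.

```python
def determinize(alphabet, nfa_delta, start, accepting):
    """Return (states, delta, start_set, accepting_sets) of an equivalent DFA."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            # Union of all NFA moves from the states in the current subset.
            target = frozenset(
                t for s in current for t in nfa_delta.get((s, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset accepts when it contains at least one NFA accept state.
    dfa_accepting = {subset for subset in seen if subset & accepting}
    return seen, dfa_delta, start_set, dfa_accepting


def dfa_accepts(dfa_delta, start_set, dfa_accepting, word):
    state = start_set
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting


# Hypothetical NFA: state "a" loops and guesses when the second-to-last
# symbol is being read; "c" is the only accept state.
NFA_DELTA = {
    ("a", "0"): {"a"},
    ("a", "1"): {"a", "b"},
    ("b", "0"): {"c"},
    ("b", "1"): {"c"},
}

if __name__ == "__main__":
    states, delta, start, acc = determinize("01", NFA_DELTA, "a", {"c"})
    print(len(states))                            # 4
    print(dfa_accepts(delta, start, acc, "110"))  # True
    print(dfa_accepts(delta, start, acc, "01"))   # False
```

In the worst case the resulting DFA can have exponentially many states in the size of the NFA, which is why the sketch builds only the reachable subsets.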

Formal definition

A deterministic finite automaton M is a 5-tuple, (Q, Σ, δ, q0, F), consisting of

• a finite set of states (Q)
• a finite set of input symbols called the alphabet (Σ)
• a transition function (δ : Q × Σ → Q)
• a start state (q0 ∈ Q)
• a set of accept states (F ⊆ Q)

Let w = a1a2 ... an be a string over the alphabet Σ. The automaton M accepts the string w if a sequence of states, r0,r1, ..., rn, exists in Q with the following conditions:

1. r0 = q0
2. ri+1 = δ(ri, ai+1), for i = 0, ..., n−1
3. rn ∈ F.

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition function δ. The last condition says that the machine accepts w if the last input of w causes the machine to halt in one of the accepting states. Otherwise, it is said that the automaton rejects the string. The set of strings M accepts is the language recognized by M and this language is denoted by L(M).

A deterministic finite automaton without accept states and without a starting state is known as a transition system or semiautomaton.

For a more comprehensive introduction to the formal definition see automata theory.

Example
The following example is of a DFA M, with a binary alphabet, which requires that the input contains an even number of 0s.

The state diagram for M

M = (Q, Σ, δ, q0, F) where
• Q = {S1, S2},
• Σ = {0, 1},
• q0 = S1,
• F = {S1}, and
• δ is defined by the following state transition table:

        0     1
S1      S2    S1
S2      S1    S2

The state S1 represents that there has been an even number of 0s in the input so far, while S2 signifies an odd number. A 1 in the input does not change the state of the automaton. When the input ends, the state will show whether the input contained an even number of 0s or not. If the input did contain an even number of 0s, M will finish in state S1, an accepting state, so the input string will be accepted.

The language recognized by M is the regular language given by the regular expression 1*( 0 (1*) 0 (1*) )*, where "*" is the Kleene star, e.g., 1* denotes any non-negative number (possibly zero) of symbols "1".
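The acceptance condition from the formal definition can be sketched in a few lines of Python. This is a minimal illustration, not a canonical implementation: the names `run_dfa`, `EVEN_ZEROS` and `accepts_even_zeros` are made up for this example, and the transition function δ is assumed to be stored as a dictionary keyed by (state, symbol).

```python
def run_dfa(delta, start, accept, word):
    """Return True if the DFA (delta, start, accept) accepts `word`."""
    state = start                       # r0 = q0
    for symbol in word:
        state = delta[(state, symbol)]  # r(i+1) = delta(r(i), a(i+1))
    return state in accept              # rn must be an accept state


# Transition table of the example automaton M (even number of 0s).
EVEN_ZEROS = {
    ("S1", "0"): "S2", ("S1", "1"): "S1",
    ("S2", "0"): "S1", ("S2", "1"): "S2",
}


def accepts_even_zeros(word):
    return run_dfa(EVEN_ZEROS, "S1", {"S1"}, word)
```

For instance, `accepts_even_zeros("1001")` holds because the input contains two 0s, while `accepts_even_zeros("10")` does not.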

Closure properties
If applying an operation to DFA-recognizable languages always yields a language that is again recognized by some DFA, then DFAs are said to be closed under that operation. DFAs are closed under the following operations:
• Union
• Intersection
• Concatenation
• Complement (negation)
• Kleene closure

Since DFAs are equivalent to nondeterministic finite automata (NFAs), these closure properties can be proved using the closure properties of NFAs.


Accept and Generate modes
A DFA representing a regular language can be used either in an accepting mode to validate that an input string is part of the language, or in a generating mode to generate a list of all the strings in the language.

In the accept mode an input string is provided, which the automaton reads from left to right, one symbol at a time. The computation begins at the start state and proceeds by reading the first symbol from the input string and following the state transition corresponding to that symbol. The system continues reading symbols and following transitions until there are no more symbols in the input, which marks the end of the computation. If, after all input symbols have been processed, the system is in an accept state, then the input string is part of the language and is said to be accepted; otherwise it is not part of the language and is not accepted.

The generating mode is similar, except that rather than validating an input string its goal is to produce a list of all the strings in the language. Instead of following a single transition out of each state, it follows all of them. In practice this can be accomplished by massive parallelism (having the program branch into two or more processes each time it is faced with a decision) or through recursion. As before, the computation begins at the start state and then proceeds to follow each available transition, keeping track of which branches it took. Every time the automaton finds itself in an accept state, it knows that the sequence of branches it took forms a valid string in the language, and it adds that string to the list that it is generating. If the language this automaton describes is infinite (i.e., contains an infinite number of strings, such as "all the binary strings with an even number of 0s"), then the computation will never halt. Given that regular languages are, in general, infinite, automata in the generating mode tend to be more of a theoretical construct.
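The generating mode can be sketched as a breadth-first exploration of all transitions. Because the full enumeration never halts for an infinite language, this sketch assumes a length bound `max_len`; that bound, and the names `generate` and `EVEN_ZEROS`, are assumptions made for the illustration.

```python
from collections import deque

# Transition table of the "even number of 0s" example, keyed by (state, symbol).
EVEN_ZEROS = {
    ("S1", "0"): "S2", ("S1", "1"): "S1",
    ("S2", "0"): "S1", ("S2", "1"): "S2",
}


def generate(delta, alphabet, start, accept, max_len):
    """List every accepted string of length <= max_len, shortest first."""
    found = []
    queue = deque([("", start)])        # (string read so far, current state)
    while queue:
        word, state = queue.popleft()
        if state in accept:
            found.append(word)          # the branches taken spell a valid string
        if len(word) < max_len:
            for symbol in alphabet:     # follow every transition, not just one
                queue.append((word + symbol, delta[(state, symbol)]))
    return found
```

With `max_len = 2`, the call `generate(EVEN_ZEROS, "01", "S1", {"S1"}, 2)` lists the empty string, "1", "00" and "11".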

DFA as a transition monoid
Alternatively, a run can be seen as a sequence of compositions of the transition function with itself. Given an input symbol a ∈ Σ, one may write the transition function as δ_a : Q → Q, using the simple trick of currying, that is, writing δ(q, a) = δ_a(q) for all q ∈ Q. This way, the transition function can be seen in simpler terms: it is just something that "acts" on a state in Q, yielding another state. One may then consider the result of function composition repeatedly applied to the various functions δ_a, δ_b, and so on. Using this notion we define δ̂ : Q × Σ* → Q. Given a pair of letters a, b ∈ Σ, one may define a new function δ̂_ab by insisting that δ̂_ab = δ_b ∘ δ_a, where ∘ denotes function composition (δ_a is applied first). Clearly, this process can be continued recursively, which gives the following recursive definition:

δ̂(q, ε) = q, where ε is the empty string, and
δ̂(q, wa) = δ_a(δ̂(q, w)), where w ∈ Σ*, a ∈ Σ and q ∈ Q.

δ̂ is defined for all words w ∈ Σ*. Repeated function composition forms a monoid. For the transition functions, this monoid is known as the transition monoid, or sometimes the transformation semigroup. The construction can also be reversed: given a δ̂, one can reconstruct a δ, and so the two descriptions are equivalent.
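As a rough illustration, the transition monoid of a small DFA can be computed by closing the single-letter maps under composition. In this sketch each monoid element is represented as a tuple giving the image of every state; that representation and the name `transition_monoid` are assumptions of the example, not notation from the text.

```python
def transition_monoid(states, alphabet, delta):
    """Map each monoid element to one shortest word that induces it.

    An element is a tuple listing the image of every state, in the order
    given by `states`; the identity element corresponds to the empty word.
    """
    identity = tuple(states)
    elements = {identity: ""}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for func in frontier:
            for a in alphabet:
                # Extend a word w to wa: apply delta_a after the map induced by w.
                extended = tuple(delta[(q, a)] for q in func)
                if extended not in elements:
                    elements[extended] = elements[func] + a
                    next_frontier.append(extended)
        frontier = next_frontier
    return elements


EVEN_ZEROS = {
    ("S1", "0"): "S2", ("S1", "1"): "S1",
    ("S2", "0"): "S1", ("S2", "1"): "S2",
}
MONOID = transition_monoid(("S1", "S2"), "01", EVEN_ZEROS)
```

For the "even number of 0s" automaton the monoid has just two elements: the identity (induced by the empty word and by 1) and the map that swaps S1 and S2 (induced by 0).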


Advantages and disadvantages
DFAs were invented to model real-world finite state machines, in contrast to the concept of a Turing machine, which was too general to study properties of real-world machines.

DFAs are one of the most practical models of computation, since there is a trivial linear-time, constant-space, online algorithm to simulate a DFA on a stream of input. Also, there are efficient algorithms to find a DFA recognizing:
• the complement of the language recognized by a given DFA.
• the union/intersection of the languages recognized by two given DFAs.

Because DFAs can be reduced to a canonical form (minimal DFAs), there are also efficient algorithms to determine:
• whether a DFA accepts any strings
• whether a DFA accepts all strings
• whether two DFAs recognize the same language
• the DFA with a minimum number of states for a particular regular language

DFAs are equivalent in computing power to nondeterministic finite automata (NFAs). This is because, firstly, any DFA is also an NFA, so an NFA can do what a DFA can do. Also, given an NFA, using the powerset construction one can build a DFA that recognizes the same language as the NFA, although the DFA could have an exponentially larger number of states than the NFA.

On the other hand, finite state automata are of strictly limited power in the languages they can recognize; many simple languages, including any problem that requires more than constant space to solve, cannot be recognized by a DFA. The classical example of a simply described language that no DFA can recognize is the bracket language, i.e., the language that consists of properly paired brackets, such as the word "(()())". No DFA can recognize the bracket language because there is no limit to the nesting depth, i.e., one can always embed another pair of brackets inside, so an infinite number of states would be required. Another, simpler example is the language consisting of strings of the form a^n b^n: some finite number of a's, followed by an equal number of b's.
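One common way to obtain a DFA for the intersection of two regular languages is the product construction, which runs both automata in lockstep. The sketch below is an illustration under assumptions: a DFA is represented as a hypothetical triple (delta, start, accept), and the helper names `product_dfa` and `accepts` are invented for the example.

```python
def product_dfa(dfa_a, dfa_b, alphabet):
    """Build a DFA for the intersection; each DFA is (delta, start, accept)."""
    delta_a, start_a, accept_a = dfa_a
    delta_b, start_b, accept_b = dfa_b
    start = (start_a, start_b)
    delta, accept = {}, set()
    todo, seen = [start], {start}
    while todo:                              # explore reachable state pairs only
        pa, pb = state = todo.pop()
        if pa in accept_a and pb in accept_b:
            accept.add(state)                # both components must accept
        for s in alphabet:
            target = (delta_a[(pa, s)], delta_b[(pb, s)])
            delta[(state, s)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return delta, start, accept


def accepts(dfa, word):
    delta, state, accept = dfa
    for s in word:
        state = delta[(state, s)]
    return state in accept


EVEN_ZEROS = ({("S1", "0"): "S2", ("S1", "1"): "S1",
               ("S2", "0"): "S1", ("S2", "1"): "S2"}, "S1", {"S1"})
EVEN_ONES = ({("T1", "0"): "T1", ("T1", "1"): "T2",
              ("T2", "0"): "T2", ("T2", "1"): "T1"}, "T1", {"T1"})
BOTH_EVEN = product_dfa(EVEN_ZEROS, EVEN_ONES, "01")
```

Changing the acceptance test to "pa in accept_a or pb in accept_b" would give the union instead, and swapping accepting and non-accepting states of a single DFA gives the complement.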

Context-free language
In formal language theory, a context-free language is a language generated by some context-free grammar. The set of all context-free languages is identical to the set of languages accepted by pushdown automata.

Examples
An archetypical context-free language is L = {a^n b^n : n ≥ 1}, the language of all non-empty even-length strings, the entire first halves of which are a's, and the entire second halves of which are b's. L is generated by the grammar S → aSb | ab, and is accepted by a pushdown automaton that pushes a symbol for each a read and pops one for each b.

Context-free languages have many applications in programming languages; for example, the language of all properly matched parentheses is generated by the grammar S → SS | (S) | ε. Also, most arithmetic expressions are generated by context-free grammars.

Closure properties
Context-free languages are closed under the following operations. That is, if L and P are context-free languages, the following languages are context-free as well:
• the union of L and P
• the reversal of L
• the concatenation of L and P
• the Kleene star of L
• the image φ(L) of L under a homomorphism φ
• the image φ⁻¹(L) of L under an inverse homomorphism φ⁻¹
• the cyclic shift of L (the language {vu : uv ∈ L})

Context-free languages are not closed under complement, intersection, or difference. However, if L is a context-free language and D is a regular language, then both their intersection L ∩ D and their difference L \ D are context-free languages.


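Nonclosure under intersection and complement
The context-free languages are not closed under intersection. This can be seen by taking the languages A = {a^n b^n c^m | m, n ≥ 0} and B = {a^m b^n c^n | m, n ≥ 0}, which are both context-free. Their intersection is A ∩ B = {a^n b^n c^n | n ≥ 0}, which can be shown to be non-context-free by the pumping lemma for context-free languages.

Context-free languages are also not closed under complementation, as for any languages A and B the intersection A ∩ B equals the complement of the union of the complements of A and B. Closure under complement would therefore, together with closure under union, imply closure under intersection.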


Decidability properties
The following problems are undecidable for arbitrary context-free grammars A and B:

• Equivalence: is L(A) = L(B)?
• is L(A) ∩ L(B) = ∅? (However, the intersection of a context-free language and a regular language is context-free, so if B were a regular language, this problem becomes decidable.)
• is L(A) = Σ*?
• is L(A) ⊆ L(B)?

The following problems are decidable for arbitrary context-free languages:

• is L(A) = ∅?
• is L(A) finite?
• Membership: given any word w, does w ∈ L(A) hold? (The membership problem is even decidable in polynomial time; see the CYK algorithm and Earley's algorithm.)
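The polynomial-time membership test can be illustrated with a compact version of the CYK algorithm. The sketch assumes the grammar is already in Chomsky normal form; the rule encoding, the function name `cyk` and the example grammar for {a^n b^n : n ≥ 1} (S → AB | AT, T → SB, A → a, B → b) are all choices made for this illustration.

```python
def cyk(word, unit_rules, pair_rules, start="S"):
    """Decide membership for a grammar in Chomsky normal form.

    unit_rules: list of (lhs, terminal); pair_rules: list of (lhs, (B, C)).
    """
    n = len(word)
    if n == 0:
        return False  # this sketch does not handle the empty word
    # table[i][l - 1] = nonterminals deriving the substring word[i:i + l]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {lhs for lhs, t in unit_rules if t == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for lhs, (b, c) in pair_rules:
                    if b in left and c in right:
                        table[i][length - 1].add(lhs)
    return start in table[0][n - 1]


UNIT = [("A", "a"), ("B", "b")]
PAIR = [("S", ("A", "B")), ("S", ("A", "T")), ("T", ("S", "B"))]
```

The three nested loops over length, start position and split point give the cubic running time in the length of the word that CYK is known for.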

Properties of context-free languages
• The reverse of a context-free language is context-free, but the complement need not be.
• Every regular language is context-free because it can be described by a context-free grammar.
• The intersection of a context-free language and a regular language is always context-free.
• There exist context-sensitive languages which are not context-free.
• To prove that a given language is not context-free, one may employ the pumping lemma for context-free languages or a number of other methods, such as Ogden's lemma, Parikh's theorem, or using closure properties.[1]
• Context-free languages are closed under union, concatenation, and Kleene star.[2]

Parsing
Deciding an instance of the membership problem, i.e., given a string w, determining whether w ∈ L(G), where L(G) is the language generated by some grammar G, is also known as parsing.

Formally, the set of all context-free languages is identical to the set of languages accepted by pushdown automata (PDA). Parser algorithms for context-free languages include the CYK algorithm and Earley's algorithm.

A special subclass of context-free languages are the deterministic context-free languages, which are defined as the set of languages accepted by a deterministic pushdown automaton and can be parsed by an LR(k) parser.[3]

See also parsing expression grammar as an alternative approach to grammar and parser.


Pushdown automaton
In computer science, a pushdown automaton (PDA) is a type of automaton that employs a stack.

The PDA is used in theories about what can be computed by machines. It is more capable than a finite-state machine but less capable than a Turing machine. Because its input can be described with a formal grammar, it can be used in parser design. The deterministic pushdown automaton can handle all deterministic context-free languages, while the nondeterministic version can handle all context-free languages.

The term "pushdown" refers to the fact that the stack can be regarded as being "pushed down" like a tray dispenser at a cafeteria, since the operations never work on elements other than the top element. A stack automaton, by contrast, does allow access to and operations on deeper elements. Stack automata can recognize a strictly larger set of languages than deterministic pushdown automata. A nested stack automaton allows full access, and also allows stacked values to be entire sub-stacks rather than just single finite symbols.

The remainder of this article describes the nondeterministic pushdown automaton.


a diagram of the pushdown automaton

Pushdown automata differ from finite state machines in two ways:

1. They can use the top of the stack to decide which transition to take.
2. They can manipulate the stack as part of performing a transition.

Pushdown automata choose a transition by indexing a table by input signal, current state, and the symbol at the top of the stack. This means that those three parameters completely determine the transition path that is chosen. Finite state machines just look at the input signal and the current state: they have no stack to work with. Pushdown automata add the stack as a parameter for choice.

Pushdown automata can also manipulate the stack as part of performing a transition. Finite state machines choose a new state, the result of following the transition. The manipulation can be to push a particular symbol to the top of the stack, or to pop off the top of the stack. The automaton can alternatively ignore the stack and leave it as it is. The choice of manipulation (or no manipulation) is determined by the transition table.

Put together: given an input signal, current state, and stack symbol, the automaton can follow a transition to another state, and optionally manipulate (push or pop) the stack.

In general, pushdown automata may have several computations on a given input string, some of which may halt in accepting configurations. If in every situation only one transition is available as continuation of the computation, the result is a deterministic pushdown automaton (DPDA); otherwise it is a nondeterministic PDA (NDPDA or NPDA). A DPDA can recognize only the deterministic context-free languages, each of which has an unambiguous grammar. Not all context-free languages are deterministic, and the problem of deciding whether a context-free language is deterministic is unsolvable.[1] As a consequence, the DPDA is a strictly weaker variant of the PDA, and there exists no algorithm for converting a PDA to an equivalent DPDA, if such a DPDA exists.

If we allow a finite automaton access to two stacks instead of just one, we obtain a more powerful device, equivalent in power to a Turing machine. A linear bounded automaton is a device which is more powerful than a pushdown automaton but less so than a Turing machine.

Relation to backtracking
Nondeterministic PDAs are able to handle situations where more than one choice of action is available. In principle it is enough to create, in every such case, new automaton instances that will handle the extra choices. The problem with this approach is that in practice most of these instances quickly fail. This can severely affect the automaton's performance, as the execution of multiple instances is a costly operation. Situations such as these can be identified in the design phase of the automaton by examining the grammar the automaton uses. This makes it possible to use backtracking in every such case in order to improve performance.

Formal Definition
We use standard formal language notation: Γ* denotes the set of strings over alphabet Γ and ε denotes the empty string.

A PDA is formally defined as a 7-tuple:

M = (Q, Σ, Γ, δ, q0, Z, F)

where
• Q is a finite set of states
• Σ is a finite set which is called the input alphabet
• Γ is a finite set which is called the stack alphabet
• δ is a finite subset of Q × (Σ ∪ {ε}) × Γ × Q × Γ*, the transition relation
• q0 ∈ Q is the start state
• Z ∈ Γ is the initial stack symbol
• F ⊆ Q is the set of accepting states

An element (p, a, A, q, α) ∈ δ is a transition of M. It has the intended meaning that M, in state p ∈ Q, with a ∈ Σ ∪ {ε} on the input and with A ∈ Γ as topmost stack symbol, may read a, change the state to q, and pop A, replacing it by pushing α ∈ Γ*. The (Σ ∪ {ε}) component of the transition relation is used to formalize that the PDA can either read a letter from the input, or proceed leaving the input untouched.

In many texts the transition relation is replaced by an (equivalent) formalization, where
• δ is the transition function, mapping Q × (Σ ∪ {ε}) × Γ into finite subsets of Q × Γ*.

Here δ(p, a, A) contains all possible actions in state p with A on the stack, while reading a on the input. One writes (q, α) ∈ δ(p, a, A) for the function precisely when (p, a, A, q, α) ∈ δ for the relation. Note that finite in this definition is essential.

Computations

a step of the pushdown automaton

In order to formalize the semantics of the pushdown automaton, a description of the current situation is introduced. Any 3-tuple (p, w, β) ∈ Q × Σ* × Γ* is called an instantaneous description (ID) of M, which includes the current state, the part of the input tape that has not been read, and the contents of the stack (topmost symbol written first). The transition relation δ defines the step-relation ⊢_M of M on instantaneous descriptions. For instruction (p, a, A, q, α) ∈ δ there exists a step (p, ax, Aγ) ⊢_M (q, x, αγ), for every x ∈ Σ* and every γ ∈ Γ*.

In general pushdown automata are nondeterministic, meaning that in a given instantaneous description (p, w, β) there may be several possible steps. Any of these steps can be chosen in a computation. With the above definition, in each step a single symbol (the top of the stack) is always popped and replaced with as many symbols as necessary. As a consequence no step is defined when the stack is empty.

Computations of the pushdown automaton are sequences of steps. The computation starts in the initial state q0 with the initial stack symbol Z on the stack, and a string w on the input tape, thus with initial description (q0, w, Z). There are two modes of accepting. The pushdown automaton either accepts by final state, which means that after reading its input the automaton reaches an accepting state (in F), or it accepts by empty stack (ε), which means that after reading its input the automaton empties its stack. The first acceptance mode uses the internal memory (state), the second the external memory (stack).

Formally one defines

1. L(M) = {w ∈ Σ* | (q0, w, Z) ⊢*_M (f, ε, γ) with f ∈ F and γ ∈ Γ*} (final state)
2. N(M) = {w ∈ Σ* | (q0, w, Z) ⊢*_M (q, ε, ε) with q ∈ Q} (empty stack)

Here ⊢*_M represents the reflexive and transitive closure of the step relation ⊢_M, meaning any number of consecutive steps (zero, one or more).

For a single pushdown automaton these two languages need not be related: they may be equal, but usually this is not the case. A specification of the automaton should also include the intended mode of acceptance. Taken over all pushdown automata, both acceptance conditions define the same family of languages.

Theorem. For each pushdown automaton M one may construct a pushdown automaton M' such that L(M) = N(M'), and vice versa, for each pushdown automaton M one may construct a pushdown automaton M' such that N(M) = L(M').

Example
The following is the formal description of the PDA which recognizes the language {0^n 1^n | n ≥ 0} by final state:

PDA for {0^n 1^n | n ≥ 0} (by final state)

M = (Q, Σ, Γ, δ, p, Z, F), where

Q = {p, q, r}
Σ = {0, 1}
Γ = {A, Z}
F = {r}

δ consists of the following six instructions:

(p, 0, Z, p, AZ), (p, 0, A, p, AA), (p, ε, Z, q, Z), (p, ε, A, q, A), (q, 1, A, q, ε), and (q, ε, Z, r, Z).

In words, in state p for each symbol 0 read, one A is pushed onto the stack. Pushing symbol A on top of another A is formalized as replacing top A by AA. In state q for each symbol 1 read one A is popped. At any moment the automaton may move from state p to state q, while it may move from state q to accepting state r only when the stack consists of a single Z.

There seems to be no generally used representation for PDA. Here we have depicted the instruction (p, a, A, q, α) by an edge from state p to state q labelled by a; A/α (read a; replace A by α).

Understanding the computation process

accepting computation for 0011

The following illustrates how the above PDA computes on different input strings. The subscript M from the step symbol ⊢ is here omitted.

(a) Input string = 0011. There are various computations, depending on the moment the move from state p to state q is made. Only one of these is accepting.

(i) (p, 0011, Z) ⊢ (q, 0011, Z) ⊢ (r, 0011, Z). The final state is accepting, but the input is not accepted this way as it has not been read.
(ii) (p, 0011, Z) ⊢ (p, 011, AZ) ⊢ (q, 011, AZ). No further steps possible.
(iii) (p, 0011, Z) ⊢ (p, 011, AZ) ⊢ (p, 11, AAZ) ⊢ (q, 11, AAZ) ⊢ (q, 1, AZ) ⊢ (q, ε, Z) ⊢ (r, ε, Z). Accepting computation: ends in accepting state, while complete input has been read.

(b) Input string = 00111. Again there are various computations. None of these is accepting.

(i) (p, 00111, Z) ⊢ (q, 00111, Z) ⊢ (r, 00111, Z). The final state is accepting, but the input is not accepted this way as it has not been read.
(ii) (p, 00111, Z) ⊢ (p, 0111, AZ) ⊢ (q, 0111, AZ). No further steps possible.
(iii) (p, 00111, Z) ⊢ (p, 0111, AZ) ⊢ (p, 111, AAZ) ⊢ (q, 111, AAZ) ⊢ (q, 11, AZ) ⊢ (q, 1, Z) ⊢ (r, 1, Z). The final state is accepting, but the input is not accepted this way as it has not been (completely) read.
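The nondeterministic search over computations can be sketched as a breadth-first exploration of instantaneous descriptions. The sketch assumes a six-instruction PDA for {0^n 1^n | n ≥ 0} with states p, q, r and stack symbols A, Z; those names, the tuple encoding of instructions and the function name are assumptions of this example. The search terminates here because this automaton's ε-moves never grow the stack; that is not guaranteed for arbitrary PDAs.

```python
from collections import deque

# Instructions: (state, input symbol or "" for epsilon, stack top,
#                new state, string pushed in place of the top symbol).
DELTA = [
    ("p", "0", "Z", "p", "AZ"),
    ("p", "0", "A", "p", "AA"),
    ("p", "",  "Z", "q", "Z"),
    ("p", "",  "A", "q", "A"),
    ("q", "1", "A", "q", ""),
    ("q", "",  "Z", "r", "Z"),
]


def accepts_by_final_state(word, delta=DELTA, start="p", bottom="Z", final=("r",)):
    first = (start, word, bottom)        # ID: (state, unread input, stack)
    seen, queue = {first}, deque([first])
    while queue:
        state, rest, stack = queue.popleft()
        if not rest and state in final:
            return True                  # input fully read in an accepting state
        if not stack:
            continue                     # no step is defined on an empty stack
        for p, a, top, q, push in delta:
            if p == state and top == stack[0] and rest.startswith(a):
                nxt = (q, rest[len(a):], push + stack[1:])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False
```

Exploring all instantaneous descriptions in this way corresponds to trying every possible moment for the move from p to q.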


PDA and Context-free Languages
Every context-free grammar can be transformed into an equivalent pushdown automaton. The derivation process of the grammar is simulated in a leftmost way. Where the grammar rewrites a nonterminal, the PDA takes the topmost nonterminal from its stack and replaces it by the right-hand part of a grammatical rule (expand). Where the grammar generates a terminal symbol, the PDA reads a symbol from input when it is the topmost symbol on the stack (match). In a sense the stack of the PDA contains the unprocessed data of the grammar, corresponding to a pre-order traversal of a derivation tree.

Technically, given a context-free grammar, the PDA is constructed as follows.

1. (1, ε, A, 1, α) for each rule A → α (expand)
2. (1, a, a, 1, ε) for each terminal symbol a (match)

As a result we obtain a single-state pushdown automaton (the state here is 1), accepting the context-free language by empty stack. Its initial stack symbol equals the axiom of the context-free grammar.

The converse, finding a grammar for a given PDA, is not that easy. The trick is to code two states of the PDA into the nonterminals of the grammar.

Theorem. For each pushdown automaton M one may construct a context-free grammar G such that N(M) = L(G).
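The expand/match construction can be sketched directly. The encoding below (single-character grammar symbols, rules as (nonterminal, right-hand side) pairs) and the names `grammar_to_pda` and `accepts_by_empty_stack` are assumptions of this illustration. The breadth-first search is only guaranteed to terminate for grammars such as the example S → 0S1 | ε; a left-recursive grammar could make it run forever.

```python
from collections import deque


def grammar_to_pda(rules, terminals):
    """Return the instructions of the single-state PDA (the state is "1")."""
    expand = [("1", "", lhs, "1", rhs) for lhs, rhs in rules]   # A -> alpha
    match = [("1", a, a, "1", "") for a in terminals]           # read a, pop a
    return expand + match


def accepts_by_empty_stack(word, delta, axiom):
    first = (word, axiom)                # the single state is left implicit
    seen, queue = {first}, deque([first])
    while queue:
        rest, stack = queue.popleft()
        if not rest and not stack:
            return True                  # input consumed and stack emptied
        if not stack:
            continue
        for _, a, top, _, push in delta:
            if top == stack[0] and rest.startswith(a):
                nxt = (rest[len(a):], push + stack[1:])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# Grammar S -> 0S1 | epsilon, with axiom S.
PDA = grammar_to_pda([("S", "0S1"), ("S", "")], "01")
```

Each expand step mirrors one leftmost rewriting step of the grammar, and each match step consumes one generated terminal.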


Generalized Pushdown Automaton (GPDA)
A GPDA is a PDA which writes an entire string of some known length to the stack or removes an entire string from the stack in one step.

A GPDA is formally defined as a 6-tuple:

M = (Q, Σ, Γ, δ, q0, F)

where Q, Σ, Γ, q0 and F are defined the same way as for a PDA, and

δ : Q × (Σ ∪ {ε}) × Γ* → P(Q × Γ*)

is the transition function.

Computation rules for a GPDA are the same as for a PDA except that the ai+1's and bi+1's are now strings instead of symbols.

GPDAs and PDAs are equivalent in that if a language is recognized by a PDA, it is also recognized by a GPDA and vice versa.

One can formulate an analytic proof for the equivalence of GPDAs and PDAs using the following simulation:

Let δ(q1, w, x1x2...xm) → (q2, y1y2...yn) be a transition of the GPDA, where q1, q2 ∈ Q, w ∈ Σ ∪ {ε}, x1, x2, ..., xm ∈ Γ, m ≥ 0, y1, y2, ..., yn ∈ Γ, n ≥ 0.

Construct the following transitions for the PDA:

δ'(q1, w, x1) → (p1, ε)
δ'(p1, ε, x2) → (p2, ε)
...
δ'(pm+n-1, ε, ε) → (q2, y1)

Recursively enumerable set
In computability theory, traditionally called recursion theory, a set S of natural numbers is called recursively enumerable, computably enumerable, semidecidable, provable or Turing-recognizable if:
• There is an algorithm such that the set of input numbers for which the algorithm halts is exactly S.
Or, equivalently,
• There is an algorithm that enumerates the members of S. That means that its output is simply a list of the members of S: s1, s2, s3, ... . If necessary, this algorithm may run forever.

The first condition suggests why the term semidecidable is sometimes used; the second suggests why computably enumerable is used. The abbreviations r.e. and c.e. are often used, even in print, instead of the full phrase.

In computational complexity theory, the complexity class containing all recursively enumerable sets is RE. In recursion theory, the lattice of r.e. sets under inclusion is denoted ℰ.

Formal definition
A set S of natural numbers is called recursively enumerable if there is a partial recursive function (synonymously, a partial computable function) whose domain is exactly S, meaning that the function is defined if and only if its input is a member of S.

Equivalent formulations
The following are all equivalent properties of a set S of natural numbers:

Semidecidability:
• The set S is recursively enumerable. That is, S is the domain (co-range) of a partial recursive function.
• There is a partial recursive function f such that f(x) = 0 if x ∈ S, and f(x) is undefined (the computation does not halt) if x ∉ S.

Enumerability:
• The set S is the range of a partial recursive function.
• The set S is the range of a total recursive function or empty. If S is infinite, the function can be chosen to be injective.
• The set S is the range of a primitive recursive function or empty. Even if S is infinite, repetition of values may be necessary in this case.

Diophantine:
• There is a polynomial p with integer coefficients and variables x, a, b, c, d, e, f, g, h, i ranging over the natural numbers such that x ∈ S ⇔ ∃a, b, c, d, e, f, g, h, i (p(x, a, b, c, d, e, f, g, h, i) = 0).
• There is a polynomial from the integers to the integers such that the set S contains exactly the non-negative numbers in its range.

The equivalence of semidecidability and enumerability can be obtained by the technique of dovetailing.

The Diophantine characterizations of a recursively enumerable set, while not as straightforward or intuitive as the first definitions, were found by Yuri Matiyasevich as part of the negative solution to Hilbert's Tenth Problem. Diophantine sets predate recursion theory and are therefore historically the first way to describe these sets (although this equivalence was only remarked more than three decades after the introduction of recursively enumerable sets). The number of bound variables in the above definition of the Diophantine set is the best known so far; it might be that a lower number can be used to define all Diophantine sets.
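Dovetailing can be sketched as follows: the runs of a semi-deciding procedure on inputs 0, 1, 2, ... are interleaved so that every run eventually receives any number of steps, and an input is listed as soon as its run halts. The sketch models a computation as a Python generator that yields once per step and returns when it halts; that modelling, and the names `dovetail` and `halts_on_even`, are assumptions made for the illustration.

```python
from itertools import count, islice


def dovetail(semi_decide):
    """Enumerate every n on which the computation semi_decide(n) halts."""
    running = {}
    for stage in count():
        running[stage] = semi_decide(stage)   # start the run on a new input
        for n in list(running):
            try:
                next(running[n])              # advance this run by one step
            except StopIteration:
                del running[n]
                yield n                       # the run on n has halted


def halts_on_even(n):
    """Toy semi-decider: halts after n steps on even n, never on odd n."""
    if n % 2 == 0:
        for _ in range(n):
            yield
    else:
        while True:
            yield


FIRST_FOUR = list(islice(dovetail(halts_on_even), 4))
```

Running the inputs one after another instead would get stuck forever on the first input whose computation does not halt; the interleaving avoids this.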

Examples
• Every recursive set is recursively enumerable, but it is not true that every recursively enumerable set is recursive.
• A recursively enumerable language is a recursively enumerable subset of a formal language.
• The set of all provable sentences in an effectively presented axiomatic system is a recursively enumerable set.
• Matiyasevich's theorem states that every recursively enumerable set is a Diophantine set (the converse is trivially true).
• The simple sets are recursively enumerable but not recursive.
• The creative sets are recursively enumerable but not recursive.
• Any productive set is not recursively enumerable.
• Given a Gödel numbering φ of the computable functions, the set {⟨i, x⟩ | φ_i(x) ↓} (where ⟨i, x⟩ is the Cantor pairing function and φ_i(x) ↓ indicates that φ_i(x) is defined) is recursively enumerable. This set encodes the halting problem as it describes the input parameters for which each Turing machine halts.
• Given a Gödel numbering φ of the computable functions, the set {⟨x, y, z⟩ | φ_x(y) = z} is recursively enumerable. This set encodes the problem of deciding a function value.
• Given a partial function f from the natural numbers into the natural numbers, f is a partial recursive function if and only if the graph of f, that is, the set of all pairs ⟨x, f(x)⟩ such that f(x) is defined, is recursively enumerable.

Properties
If A and B are recursively enumerable sets, then A ∩ B, A ∪ B and A × B (with the ordered pair of natural numbers mapped to a single natural number with the Cantor pairing function) are recursively enumerable sets. The preimage of a recursively enumerable set under a partial recursive function is a recursively enumerable set.

A set is recursively enumerable if and only if it is at level Σ⁰₁ of the arithmetical hierarchy.

A set is called co-recursively enumerable or co-r.e. if its complement is recursively enumerable. Equivalently, a set is co-r.e. if and only if it is at level Π⁰₁ of the arithmetical hierarchy.

A set A is recursive (synonym: computable) if and only if both A and the complement of A are recursively enumerable. A set is recursive if and only if it is either the range of an increasing total recursive function or finite.


Some pairs of recursively enumerable sets are effectively separable and some are not.

Remarks
According to the Church–Turing thesis, any effectively calculable function is calculable by a Turing machine, and thus a set S is recursively enumerable if and only if there is some algorithm which yields an enumeration of S. This cannot be taken as a formal definition, however, because the Church–Turing thesis is an informal conjecture rather than a formal axiom.

The definition of a recursively enumerable set as the domain of a partial function, rather than the range of a total recursive function, is common in contemporary texts. This choice is motivated by the fact that in generalized recursion theories, such as α-recursion theory, the definition corresponding to domains has been found to be more natural. Other texts use the definition in terms of enumerations, which is equivalent for recursively enumerable sets.

An artistic representation of a Turing machine (rules table not represented)

A Turing machine is a device that manipulates symbols on a strip of tape according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm, and is particularly useful in explaining the functions of a CPU inside a computer.

The "Turing" machine was described in 1936 by Alan Turing,[1] who called it an "a-machine" (automatic machine). The Turing machine is not intended as practical computing technology, but rather as a hypothetical device representing a computing machine. Turing machines help computer scientists understand the limits of mechanical computation.

Turing gave a succinct definition of the experiment in his 1948 essay, "Intelligent Machinery". Referring to his 1936 publication, Turing wrote that the Turing machine, here called a Logical Computing Machine, consisted of:

"... unlimited memory capacity obtained in the form of an infinite tape marked out into squares, on each of which a symbol could be printed. At any moment there is one symbol in the machine; it is called the scanned symbol. The machine can alter the scanned symbol and its behavior is in part determined by that symbol, but the symbols on the tape elsewhere do not affect the behaviour of the machine. However, the tape can be moved back and forth through the machine, this being one of the elementary operations of the machine. Any symbol on the tape may therefore eventually have an innings."[2] (Turing 1948, p. 61)

A Turing machine that is able to simulate any other Turing machine is called a universal Turing machine (UTM, or simply a universal machine). A more mathematically oriented definition with a similar "universal" nature was introduced by Alonzo Church, whose work on lambda calculus intertwined with Turing's in a formal theory of computation known as the Church–Turing thesis. The thesis states that Turing machines indeed capture the informal notion of effective method in logic and mathematics, and provide a precise definition of an algorithm or 'mechanical procedure'.

Studying their abstract properties yields many insights into computer science and complexity theory.

Informal description
For visualizations of Turing machines, see Turing machine gallery.

The Turing machine mathematically models a machine that mechanically operates on a tape. On this tape are symbols which the machine can read and write, one at a time, using a tape head. Operation is fully determined by a finite set of elementary instructions such as "in state 42, if the symbol seen is 0, write a 1; if the symbol seen is 1, change into state 17; in state 17, if the symbol seen is 0, write a 1 and change to state 6;" etc. In the original article ("On computable numbers, with an application to the Entscheidungsproblem", see also references below), Turing imagines not a mechanism, but a person whom he calls the "computer", who executes these deterministic mechanical rules slavishly (or as Turing puts it, "in a desultory manner").

The head is always over a particular square of the tape; only a finite stretch of squares is shown. The instruction to be performed (q4) is shown over the scanned square. (Drawing after Kleene (1952) p. 375.)

Here, the internal state (q1) is shown inside the head, and the illustration describes the tape as being infinite and pre-filled with "0", the symbol serving as blank. The system's full state (its configuration) consists of the internal state, the contents of the shaded squares including the blank scanned by the head ("11B"), and the position of the head. (Drawing after Minsky (1967) p. 121.)

More precisely, a Turing machine consists of:

1. A tape which is divided into cells, one next to the other. Each cell contains a symbol from some finite alphabet. The alphabet contains a special blank symbol (here written as 'B') and one or more other symbols. The tape is assumed to be arbitrarily extendable to the left and to the right, i.e., the Turing machine is always supplied with as much tape as it needs for its computation. Cells that have not been written to before are assumed to be filled with the blank symbol. In some models the tape has a left end marked with a special symbol; the tape extends or is indefinitely extensible to the right.

2. A head that can read and write symbols on the tape and move the tape left and right one (and only one) cell at a time. In some models the head moves and the tape is stationary.

3. A state register that stores the state of the Turing machine, one of finitely many. There is one special start state with which the state register is initialized. These states, writes Turing, replace the "state of mind" a person performing computations would ordinarily be in.

4. A finite table (occasionally called an action table or transition function) of instructions (usually quintuples [5-tuples]: qiaj→qi1aj1dk, but sometimes 4-tuples) that, given the state (qi) the machine is currently in and the symbol (aj) it is reading on the tape (the symbol currently under the head), tells the machine to do the following in sequence (for the 5-tuple models):
• Either erase or write a symbol (replacing aj with aj1), and then
• Move the head (which is described by dk and can have values: 'L' for one step left or 'R' for one step right or 'N' for staying in the same place), and then
• Assume the same or a new state as prescribed (go to state qi1).

In the 4-tuple models, erasing or writing a symbol (aj1) and moving the head left or right (dk) are specified as separate instructions. Specifically, the table tells the machine to (ia) erase or write a symbol or (ib) move the head left or right, and then (ii) assume the same or a new state as prescribed, but not both actions (ia) and (ib) in the same instruction. In some models, if there is no entry in the table for the current combination of symbol and state then the machine will halt; other models require all entries to be filled.

Note that every part of the machine (its state and symbol-collections) and its actions (printing, erasing and tape motion) is finite, discrete and distinguishable; it is the potentially unlimited amount of tape that gives it an unbounded amount of storage space.

Formal definition
Hopcroft and Ullman (1979, p. 148) formally define a (one-tape) Turing machine as a 7-tuple $M = \langle Q, \Gamma, b, \Sigma, \delta, q_0, F \rangle$ where

• $Q$ is a finite, non-empty set of states
• $\Gamma$ is a finite, non-empty set of the tape alphabet/symbols
• $b \in \Gamma$ is the blank symbol (the only symbol allowed to occur on the tape infinitely often at any step during the computation)
• $\Sigma \subseteq \Gamma \setminus \{b\}$ is the set of input symbols
• $q_0 \in Q$ is the initial state
• $F \subseteq Q$ is the set of final or accepting states.
• $\delta : (Q \setminus F) \times \Gamma \to Q \times \Gamma \times \{L, R\}$ is a partial function called the transition function, where L is left shift, R is right shift. (A relatively uncommon variant allows "no shift", say N, as a third element of the latter set.)

Anything that operates according to these specifications is a Turing machine.

The 7-tuple for the 3-state busy beaver looks like this (see more about this busy beaver at Turing machine examples):

• $Q = \{A, B, C, \text{HALT}\}$
• $\Gamma = \{0, 1\}$
• $b = 0$ ("blank")
• $\Sigma = \{1\}$
• $q_0 = A$ (the initial state)
• $F = \{\text{HALT}\}$
• $\delta$ = see state-table below

Initially all tape cells are marked with 0.

State table for 3-state, 2-symbol busy beaver

Tape     Current state A             Current state B             Current state C
symbol   Write  Move tape  Next      Write  Move tape  Next      Write  Move tape  Next
0        1      R          B         1      L          A         1      L          B
1        1      L          C         1      R          B         1      R          HALT
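As a minimal sketch of how the 7-tuple can be written down and checked mechanically, the following Python fragment stores the seven components in a dataclass and tests the side conditions of the definition (blank in the tape alphabet, input symbols exclude the blank, and so on). The class name `TuringMachine`, the field names and the method `is_well_formed` are illustrative assumptions, not part of Hopcroft and Ullman's text; the busy beaver values are read off the state table above, with each table entry reordered as (next state, written symbol, shift).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuringMachine:
    """Hypothetical container for the seven components of the definition."""
    states: frozenset          # Q
    tape_alphabet: frozenset   # Gamma
    blank: object              # b
    input_alphabet: frozenset  # Sigma
    delta: dict                # (state, symbol) -> (next state, written symbol, shift)
    start: object              # q0
    accepting: frozenset       # F

    def is_well_formed(self) -> bool:
        """Check the side conditions listed in the 7-tuple definition."""
        non_final = self.states - self.accepting
        return (
            bool(self.states)
            and bool(self.tape_alphabet)
            and self.blank in self.tape_alphabet
            and self.input_alphabet <= self.tape_alphabet - {self.blank}
            and self.start in self.states
            and self.accepting <= self.states
            and all(
                q in non_final and s in self.tape_alphabet
                and q2 in self.states and s2 in self.tape_alphabet
                and shift in ("L", "R")
                for (q, s), (q2, s2, shift) in self.delta.items()
            )
        )


# The 3-state, 2-symbol busy beaver, transcribed from the state table.
BUSY_BEAVER = TuringMachine(
    states=frozenset({"A", "B", "C", "HALT"}),
    tape_alphabet=frozenset({0, 1}),
    blank=0,
    input_alphabet=frozenset({1}),
    delta={
        ("A", 0): ("B", 1, "R"), ("A", 1): ("C", 1, "L"),
        ("B", 0): ("A", 1, "L"), ("B", 1): ("B", 1, "R"),
        ("C", 0): ("B", 1, "L"), ("C", 1): ("HALT", 1, "R"),
    },
    start="A",
    accepting=frozenset({"HALT"}),
)

print(BUSY_BEAVER.is_well_formed())
```

Because the transition function is partial, `delta` is a plain dictionary: a missing key simply means that no move is defined for that state and symbol.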


Additional details required to visualize or implement Turing machines
In the words of van Emde Boas (1990), p. 6: "The set-theoretical object [his formal seven-tuple description similar to the above] provides only partial information on how the machine will behave and what its computations will look like."

For instance,
• There will need to be many decisions on what the symbols actually look like, and a failproof way of reading and writing symbols indefinitely.
• The shift left and shift right operations may shift the tape head across the tape, but when actually building a Turing machine it is more practical to make the tape slide back and forth under the head instead.
• The tape can be finite, and automatically extended with blanks as needed (which is closest to the mathematical definition), but it is more common to think of it as stretching infinitely at both ends and being pre-filled with blanks except on the explicitly given finite fragment the tape head is on. (This is, of course, not implementable in practice.) The tape cannot be fixed in length, since that would not correspond to the given definition and would seriously limit the range of computations the machine can perform to those of a linear bounded automaton.

Alternative definitions
Definitions in literature sometimes differ slightly, to make arguments or proofs easier or clearer, but this is always done in such a way that the resulting machine has the same computational power. For example, changing the set $\{L, R\}$ to $\{L, R, N\}$, where N ("None" or "No-operation") would allow the machine to stay on the same tape cell instead of moving left or right, does not increase the machine's computational power.

The most common convention represents each "Turing instruction" in a "Turing table" by one of nine 5-tuples, per the convention of Turing/Davis (Turing (1936) in Undecidable, p. 126-127 and Davis (2000) p. 152):

(definition 1): $(q_i, S_j, S_k/E/N, L/R/N, q_m)$
( current state $q_i$, symbol scanned $S_j$, print symbol $S_k$/erase E/none N, move_tape_one_square left L/right R/none N, new state $q_m$ )

Other authors (Minsky (1967) p. 119, Hopcroft and Ullman (1979) p. 158, Stone (1972) p. 9) adopt a different convention, with new state $q_m$ listed immediately after the scanned symbol $S_j$:

(definition 2): $(q_i, S_j, q_m, S_k/E/N, L/R/N)$
( current state $q_i$, symbol scanned $S_j$, new state $q_m$, print symbol $S_k$/erase E/none N, move_tape_one_square left L/right R/none N )

For the remainder of this article "definition 1" (the Turing/Davis convention) will be used.
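The claim that a "no move" option N adds no computational power can be illustrated with a small rewriting sketch. Under the assumptions that the tape is unbounded in both directions and that a table maps (state, symbol) to (print symbol, move, new state), every N-quintuple can be replaced by a step to the right into a fresh helper state that immediately steps back to the left. The names `eliminate_stay_moves`, `run` and the helper state label `("bounce", ...)` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def eliminate_stay_moves(table, alphabet):
    """Rewrite each 'N' quintuple as a right step followed by a left step."""
    new_table = {}
    for (state, symbol), (write, move, nxt) in table.items():
        if move != "N":
            new_table[(state, symbol)] = (write, move, nxt)
            continue
        bounce = ("bounce", nxt)  # hypothetical helper state
        new_table[(state, symbol)] = (write, "R", bounce)
        for s in alphabet:
            # Leave the neighbouring cell unchanged and step back.
            new_table[(bounce, s)] = (s, "L", nxt)
    return new_table


def run(table, start, halt, blank=0, max_steps=10_000):
    """Tiny interpreter used only to compare the two tables."""
    tape, head, state = defaultdict(lambda: blank), 0, start
    for _ in range(max_steps):
        if state == halt:
            break
        write, move, state = table[(state, tape[head])]
        tape[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
    cells = {i: s for i, s in tape.items() if s != blank}
    return cells, head, state


# 3-state busy beaver in 5-tuple form, halting with an N move.
BB = {
    ("A", 0): (1, "R", "B"), ("A", 1): (1, "L", "C"),
    ("B", 0): (1, "L", "A"), ("B", 1): (1, "R", "B"),
    ("C", 0): (1, "L", "B"), ("C", 1): (1, "N", "H"),
}

converted = eliminate_stay_moves(BB, alphabet=[0, 1])
print(run(BB, "A", "H") == run(converted, "A", "H"))
```

The converted machine needs one extra step per eliminated N move, which affects running time but not what is computed.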

Example: state table for the 3-state 2-symbol busy beaver reduced to 5-tuples

Current state  Scanned symbol  Print symbol  Move tape  Final (i.e. next) state  5-tuple
A              0               1             R          B                        (A, 0, 1, R, B)
A              1               1             L          C                        (A, 1, 1, L, C)
B              0               1             L          A                        (B, 0, 1, L, A)
B              1               1             R          B                        (B, 1, 1, R, B)
C              0               1             L          B                        (C, 0, 1, L, B)
C              1               1             N          H                        (C, 1, 1, N, H)

In the following table, Turing's original model allowed only the first three lines that he called N1, N2, N3 (cf Turing in Undecidable, p. 126). He allowed for erasure of the "scanned square" by naming a 0th symbol S0 = "erase" or "blank", etc. However, he did not allow for non-printing, so every instruction-line includes "print symbol Sk" or "erase" (cf footnote 12 in Post (1947), Undecidable p. 300). The abbreviations are Turing's (Undecidable p. 119). Subsequent to Turing's original paper in 1936–1937, machine-models have allowed all nine possible types of five-tuples:

      Current m-configuration  Tape    Print-     Tape-    Final m-configuration
      (Turing state)           symbol  operation  motion   (Turing state)           5-tuple               5-tuple comments             4-tuple
N1    qi                       Sj      Print(Sk)  Left L   qm                       (qi, Sj, Sk, L, qm)   "blank" = S0, 1 = S1, etc.
N2    qi                       Sj      Print(Sk)  Right R  qm                       (qi, Sj, Sk, R, qm)   "blank" = S0, 1 = S1, etc.
N3    qi                       Sj      Print(Sk)  None N   qm                       (qi, Sj, Sk, N, qm)   "blank" = S0, 1 = S1, etc.   (qi, Sj, Sk, qm)
4     qi                       Sj      None N     Left L   qm                       (qi, Sj, N, L, qm)                                 (qi, Sj, L, qm)
5     qi                       Sj      None N     Right R  qm                       (qi, Sj, N, R, qm)                                 (qi, Sj, R, qm)
6     qi                       Sj      None N     None N   qm                       (qi, Sj, N, N, qm)    Direct "jump"                (qi, Sj, N, qm)
7     qi                       Sj      Erase      Left L   qm                       (qi, Sj, E, L, qm)
8     qi                       Sj      Erase      Right R  qm                       (qi, Sj, E, R, qm)
9     qi                       Sj      Erase      None N   qm                       (qi, Sj, E, N, qm)                                 (qi, Sj, E, qm)

Any Turing table (list of instructions) can be constructed from the above nine 5-tuples. For technical reasons, the three non-printing or "N" instructions (4, 5, 6) can usually be dispensed with. For examples see Turing machine examples.

Less frequently, 4-tuples are encountered: these represent a further atomization of the Turing instructions (cf Post (1947), Boolos & Jeffrey (1974, 1999), Davis-Sigal-Weyuker (1994)); also see more at Post–Turing machine.
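One way to see why the non-printing and erasing forms are conveniences rather than necessities is the following sketch, which rewrites any of the nine 5-tuple kinds into a plain "print" form. It assumes, as in Turing's convention described above, that erasing means printing the blank symbol S0, and it treats "not printing" as reprinting the scanned symbol. The function name `normalize_print` and the string labels "S0", "N" and "E" are assumptions made for this illustration (in particular, "N" and "E" are assumed not to be tape symbols).

```python
BLANK = "S0"  # assumed label for the blank symbol


def normalize_print(quintuple, blank=BLANK):
    """Rewrite (qi, Sj, Sk/E/N, move, qm) so that it always prints a symbol."""
    state, scanned, operation, move, new_state = quintuple
    if operation == "N":      # non-printing: reprint what is already there
        operation = scanned
    elif operation == "E":    # erase: print the blank symbol
        operation = blank
    return (state, scanned, operation, move, new_state)


print(normalize_print(("q1", "S1", "N", "L", "q2")))
print(normalize_print(("q1", "S1", "E", "R", "q2")))
```

After this rewriting only the move field can still contain N, so rows 4 to 9 of the table collapse onto rows N1 to N3.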

The "state"
The word "state" used in context of Turing machines can be a source of confusion, as it can mean two things. Most commentators after Turing have used "state" to mean the name/designator of the current instruction to be performed, i.e. the contents of the state register. But Turing (1936) made a strong distinction between a record of what he called the machine's "m-configuration" (its internal state) and the machine's (or person's) "state of progress" through the computation, the current state of the total system. What Turing called "the state formula" includes both the current instruction and all the symbols on the tape:

Thus the state of progress of the computation at any stage is completely determined by the note of instructions and the symbols on the tape. That is, the state of the system may be described by a single expression (sequence of symbols) consisting of the symbols on the tape followed by Δ (which we suppose not to appear elsewhere) and then by the note of instructions. This expression is called the 'state formula'.
(Undecidable, p. 139–140, emphasis added)

Earlier in his paper Turing carried this even further: he gives an example where he places a symbol of the current "m-configuration" (the instruction's label) beneath the scanned square, together with all the symbols on the tape (Undecidable, p. 121); this he calls "the complete configuration" (Undecidable, p. 118). To print the "complete configuration" on one line he places the state-label/m-configuration to the left of the scanned symbol.

A variant of this is seen in Kleene (1952) where Kleene shows how to write the Gödel number of a machine's "situation": he places the "m-configuration" symbol q4 over the scanned square in roughly the center of the 6 non-blank squares on the tape (see the Turing-tape figure in this article) and puts it to the right of the scanned square. But Kleene refers to "q4" itself as "the machine state" (Kleene, p. 374-375). Hopcroft and Ullman call this composite the "instantaneous description" and follow the Turing convention of putting the "current state" (instruction-label, m-configuration) to the left of the scanned symbol (p. 149).

Example: total state of 3-state 2-symbol busy beaver after 3 "moves" (taken from example "run" in the figure below):

1A1

This means: after three moves the tape has ... 000110000 ... on it, the head is scanning the right-most 1, and the state is A. Blanks (in this case represented by "0"s) can be part of the total state as shown here: B01; the tape has a single 1 on it, but the head is scanning the 0 ("blank") to its left and the state is B.

"State" in the context of Turing machines should be clarified as to which is being described: (i) the current instruction, or (ii) the list of symbols on the tape together with the current instruction, or (iii) the list of symbols on the tape together with the current instruction placed to the left of the scanned symbol or to the right of the scanned symbol.

Turing's biographer Andrew Hodges (1983: 107) has noted and discussed this confusion.
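The one-line notation used in the examples "1A1" and "B01" can be produced mechanically. The sketch below assumes the tape is stored as a dictionary from cell index to symbol and follows the Turing / Hopcroft–Ullman convention of writing the state label immediately to the left of the scanned symbol. The function name `instantaneous_description` and the dictionary representation are assumptions of this example, not notation from the cited sources.

```python
def instantaneous_description(tape, head, state, blank="0"):
    """Render the non-blank tape stretch with the state left of the scanned cell."""
    used = [i for i, s in tape.items() if s != blank]
    # The printed stretch must include the scanned cell even if it is blank.
    lo = min(used + [head])
    hi = max(used + [head])
    out = []
    for i in range(lo, hi + 1):
        if i == head:
            out.append(state)
        out.append(tape.get(i, blank))
    return "".join(out)


# Tape ...0110..., head on the right-most 1, state A.
print(instantaneous_description({0: "1", 1: "1"}, head=1, state="A"))  # 1A1
# Tape with a single 1, head on the blank to its left, state B.
print(instantaneous_description({1: "1"}, head=0, state="B"))          # B01
```

Kleene's variant, with the label to the right of the scanned symbol, would only require swapping the two `append` calls inside the loop.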

Turing machine "state" diagrams

The table for the 3-state busy beaver ("P" = print/write a "1")

Tape     Current state A             Current state B             Current state C
symbol   Write  Move tape  Next      Write  Move tape  Next      Write  Move tape  Next
0        P      R          B         P      L          A         P      L          B
1        P      L          C         P      R          B         P      R          HALT

The "3-state busy beaver" Turing machine in a finite state representation. Each circle represents a "state" of the TABLE, i.e. an "m-configuration" or "instruction". "Direction" of a state transition is shown by an arrow. The label (e.g. 0/P,R) near the outgoing state (at the "tail" of the arrow) specifies the scanned symbol that causes a particular transition (e.g. 0) followed by a slash /, followed by the subsequent "behaviors" of the machine, e.g. "P Print" then move tape "R Right". No generally accepted format exists. The convention shown is after McCluskey (1965), Booth (1967), and Hill and Peterson (1974).

To the right: the above TABLE as expressed as a "state transition" diagram.

Usually large TABLES are better left as tables (Booth, p. 74). They are more readily simulated by computer in tabular form (Booth, p. 74). However, certain concepts, e.g. machines with "reset" states and machines with repeating patterns (cf Hill and Peterson p. 244ff), can be more readily seen when viewed as a drawing.

Whether a drawing represents an improvement on its TABLE must be decided by the reader for the particular context. See Finite state machine for more.


The evolution of the busy-beaver's computation starts at the top and proceeds to the bottom.

The reader should again be cautioned that such diagrams represent a snapshot of their TABLE frozen in time, not the course ("trajectory") of a computation through time and/or space. While every time the busy beaver machine "runs" it will always follow the same state-trajectory, this is not true for the "copy" machine that can be provided with variable input "parameters".

The diagram "Progress of the computation" shows the 3-state busy beaver's "state" (instruction) progress through its computation from start to finish. On the far right is the Turing "complete configuration" (Kleene "situation", Hopcroft–Ullman "instantaneous description") at each step. If the machine were to be stopped and cleared to blank both the "state register" and entire tape, these "configurations" could be used to rekindle a computation anywhere in its progress (cf Turing (1936) Undecidable pp. 139–140).

Models equivalent to the Turing machine model
Many machines that might be thought to have more computational capability than a simple universal Turing machine can be shown to have no more power (Hopcroft and Ullman p. 159, cf Minsky (1967)). They might compute faster, perhaps, or use less memory, or their instruction set might be smaller, but they cannot compute more powerfully (i.e. more mathematical functions). (Recall that the Church–Turing thesis hypothesizes this to be true for any kind of machine: that anything that can be "computed" can be computed by some Turing machine.)

A Turing machine is equivalent to a pushdown automaton that has been made more flexible and concise by relaxing the last-in-first-out requirement of its stack.

At the other extreme, some very simple models turn out to be Turing-equivalent, i.e. to have the same computational power as the Turing machine model.

Common equivalent models are the multi-tape Turing machine, multi-track Turing machine, machines with input and output, and the non-deterministic Turing machine (NDTM) as opposed to the deterministic Turing machine (DTM) for which the action table has at most one entry for each combination of symbol and state.

Read-only, right-moving Turing machines are equivalent to NDFAs (as well as DFAs by conversion using the NDFA to DFA conversion algorithm).

For practical and didactic purposes, the equivalent register machine can be used like an ordinary assembly programming language.


Choice c-machines, Oracle o-machines
Early in his paper (1936) Turing makes a distinction between an "automatic machine" (its "motion ... completely determined by the configuration") and a "choice machine":

...whose motion is only partially determined by the configuration ... When such a machine reaches one of these ambiguous configurations, it cannot go on until some arbitrary choice has been made by an external operator. This would be the case if we were using machines to deal with axiomatic systems.
(Undecidable, p. 118)

Turing (1936) does not elaborate further except in a footnote in which he describes how to use an a-machine to "find all the provable formulae of the [Hilbert] calculus" rather than use a choice machine. He "suppose[s] that the choices are always between two possibilities 0 and 1. Each proof will then be determined by a sequence of choices $i_1, i_2, \dots, i_n$ ($i_1$ = 0 or 1, $i_2$ = 0 or 1, ..., $i_n$ = 0 or 1), and hence the number $2^n + i_1 2^{n-1} + i_2 2^{n-2} + \dots + i_n$ completely determines the proof. The automatic machine carries out successively proof 1, proof 2, proof 3, ..." (Footnote ‡, Undecidable, p. 138)

This is indeed the technique by which a deterministic (i.e. a-) Turing machine can be used to mimic the action of a nondeterministic Turing machine; Turing solved the matter in a footnote and appears to dismiss it from further consideration.

An oracle machine or o-machine is a Turing a-machine that pauses its computation at state "o" while, to complete its calculation, it "awaits the decision" of "the oracle", an unspecified entity "apart from saying that it cannot be a machine" (Turing (1939), Undecidable p. 166–168). The concept is now actively used by mathematicians.
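The numbering in Turing's footnote can be checked with a few lines of Python. The sketch below encodes a sequence of binary choices i1, ..., in as the number 2^n + i1·2^(n-1) + ... + in (that is, the binary numeral "1" followed by the choices) and decodes it again; the function names `encode` and `decode` are assumptions made for this illustration.

```python
def encode(choices):
    """Map choices i1..in (each 0 or 1) to 2^n + i1*2^(n-1) + ... + in."""
    number = 1  # the leading 1 contributes the 2^n term
    for bit in choices:
        number = 2 * number + bit
    return number


def decode(number):
    """Inverse of encode: drop the leading 1 of the binary expansion."""
    return [int(c) for c in bin(number)[3:]]


print(encode([0, 1, 1]))                   # 8 + 0 + 2 + 1 = 11
print([decode(k) for k in range(1, 8)])    # every choice sequence of length < 3
```

Counting 1, 2, 3, ... therefore visits every finite choice sequence exactly once, which is what lets a deterministic machine work through all the branches of a choice machine one after another.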

Universal Turing machines
As Turing wrote in Undecidable, p. 128 (italics added):

It is possible to invent a single machine which can be used to compute any computable sequence. If this machine U is supplied with the tape on the beginning of which is written the string of quintuples separated by semicolons of some computing machine M, then U will compute the same sequence as M.

This finding is now taken for granted, but at the time (1936) it was considered astonishing. The model of computation that Turing called his "universal machine" ("U" for short) is considered by some (cf Davis (2000)) to have been the fundamental theoretical breakthrough that led to the notion of the stored-program computer.

Turing's paper ... contains, in essence, the invention of the modern computer and some of the programming techniques that accompanied it.
(Minsky (1967), p. 104)

In terms of computational complexity, a multi-tape universal Turing machine need only be slower by a logarithmic factor compared to the machines it simulates. This result was obtained in 1966 by F. C. Hennie and R. E. Stearns. (Arora and Barak, 2009, theorem 1.9)
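The idea that one fixed procedure can run any machine handed to it as a string of quintuples can be sketched in Python. This is an ordinary interpreter, not Turing's construction of U as a Turing machine; the comma-and-semicolon text format, the function names `parse_quintuples` and `universal`, and the convention that the machine halts when no quintuple applies are all assumptions of this example.

```python
from collections import defaultdict


def parse_quintuples(description):
    """Turn 'q,s,s2,d,q2;...' into a table keyed by (state, scanned symbol)."""
    table = {}
    for item in description.split(";"):
        state, scanned, write, move, nxt = (f.strip() for f in item.split(","))
        table[(state, scanned)] = (write, move, nxt)
    return table


def universal(description, start, blank="0", max_steps=10_000):
    """Run the machine described by `description` on an initially blank tape."""
    table = parse_quintuples(description)
    tape, head, state, steps = defaultdict(lambda: blank), 0, start, 0
    # Halt when no quintuple matches the current state and scanned symbol.
    while (state, tape[head]) in table and steps < max_steps:
        write, move, state = table[(state, tape[head])]
        tape[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    contents = "".join(tape[i] for i in sorted(tape)).strip(blank)
    return contents, steps


# The 3-state busy beaver, written as data rather than as code.
BUSY_BEAVER = "A,0,1,R,B;A,1,1,L,C;B,0,1,L,A;B,1,1,R,B;C,0,1,L,B;C,1,1,N,H"
print(universal(BUSY_BEAVER, start="A"))  # ('111111', 13)
```

Changing the description string changes the machine being run without touching `universal`, which is the sense in which the program has become data.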


Comparison with real machines

A Turing machine realization in LEGO

It is often said that Turing machines, unlike simpler automata, are as powerful as real machines, and are able to execute any operation that a real program can. What is missed in this statement is that, because a real machine can only be in finitely many configurations, in fact this "real machine" is nothing but a linear bounded automaton. On the other hand, Turing machines are equivalent to machines that have an unlimited amount of storage space for their computations. In fact, Turing machines are not intended to model computers, but rather they are intended to model computation itself; historically, computers, which compute only on their (fixed) internal storage, were developed only later.

There are a number of ways to explain why Turing machines are useful models of real computers:

1. Anything a real computer can compute, a Turing machine can also compute. For example: "A Turing machine can simulate any type of subroutine found in programming languages, including recursive procedures and any of the known parameter-passing mechanisms" (Hopcroft and Ullman p. 157). A large enough FSA can also model any real computer, disregarding IO. Thus, a statement about the limitations of Turing machines will also apply to real computers.

2. The difference lies only with the ability of a Turing machine to manipulate an unbounded amount of data. However, given a finite amount of time, a Turing machine (like a real machine) can only manipulate a finite amount of data.

3. Like a Turing machine, a real machine can have its storage space enlarged as needed, by acquiring more disks or other storage media. If the supply of these runs short, the Turing machine may become less useful as a model. But the fact is that neither Turing machines nor real machines need astronomical amounts of storage space in order to perform useful computation. The processing time required is usually much more of a problem.

4. Descriptions of real machine programs using simpler abstract models are often much more complex than descriptions using Turing machines. For example, a Turing machine describing an algorithm may have a few hundred states, while the equivalent deterministic finite automaton on a given real machine has quadrillions. This makes the DFA representation infeasible to analyze.

5. Turing machines describe algorithms independent of how much memory they use. There is a limit to the memory possessed by any current machine, but this limit can rise arbitrarily in time. Turing machines allow us to make statements about algorithms which will (theoretically) hold forever, regardless of advances in conventional computing machine architecture.

6. Turing machines simplify the statement of algorithms. Algorithms running on Turing-equivalent abstract machines are usually more general than their counterparts running on real machines, because they have arbitrary-precision data types available and never have to deal with unexpected conditions (including, but not limited to, running out of memory).
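The remark in point 4 about the size of the equivalent finite automaton follows from simple counting: a memory of n bits can hold 2^n different contents, and a finite-state description needs a state for each of them. The short sketch below makes the arithmetic concrete; the function name `configuration_count` and the sample sizes are arbitrary choices for illustration and describe no particular machine.

```python
def configuration_count(memory_bits: int) -> int:
    """Number of distinct contents of a memory with the given number of bits."""
    return 2 ** memory_bits


# Hypothetical memory sizes, chosen only to show how quickly the count grows.
for bits in (8, 50, 64):
    print(bits, configuration_count(bits))
```

Already at 50 bits the count exceeds a quadrillion (10^15), so state tables for realistic memories cannot be written out, while a Turing machine table for the same algorithm stays small.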

One way in which Turing machines are a poor model for programs is that many real programs, such as operating systems and word processors, are written to receive unbounded input over time, and therefore do not halt. Turing machines do not model such ongoing computation well (but can still model portions of it, such as individual procedures).


Limitations of Turing machines

Computational complexity theory

A limitation of Turing machines is that they do not model the strengths of a particular arrangement well. For instance, modern stored-program computers are actually instances of a more specific form of abstract machine known as the random access stored program machine or RASP machine model. Like the universal Turing machine, the RASP stores its "program" in "memory" external to its finite-state machine's "instructions". Unlike the universal Turing machine, the RASP has an infinite number of distinguishable, numbered but unbounded "registers": memory "cells" that can contain any integer (cf. Elgot and Robinson (1964), Hartmanis (1971), and in particular Cook-Reckhow (1973); references at random access machine). The RASP's finite-state machine is equipped with the capability for indirect addressing (e.g. the contents of one register can be used as an address to specify another register); thus the RASP's "program" can address any register in the register-sequence. The upshot of this distinction is that there are computational optimizations that can be performed based on the memory indices, which are not possible in a general Turing machine; thus when Turing machines are used as the basis for bounding running times, a 'false lower bound' can be proven on certain algorithms' running times (due to the false simplifying assumption of a Turing machine). An example of this is binary search, an algorithm that can be shown to perform more quickly when using the RASP model of computation rather than the Turing machine model.
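The binary search example can be made concrete with a deliberately simplified cost model. The sketch below counts, for the same search, the number of probes (the cost when any cell can be addressed directly, as on a RASP) and the number of single-cell head movements needed to walk from one probed cell to the next (a rough stand-in for the cost on a one-tape machine with one item per cell). The function name `binary_search_costs` and this cost model are assumptions of the example; a careful Turing machine analysis would also have to account for how indices are computed and compared.

```python
def binary_search_costs(sorted_items, target):
    """Return (index or None, probes, head_moves) for one binary search."""
    lo, hi = 0, len(sorted_items) - 1
    probes, head_moves, head = 0, 0, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1                     # one direct access on a RASP
        head_moves += abs(mid - head)   # cells a tape head must cross
        head = mid
        if sorted_items[mid] == target:
            return mid, probes, head_moves
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes, head_moves


items = list(range(1024))
print(binary_search_costs(items, 1023))  # (1023, 11, 1023)
```

For 1024 items the search needs 11 probes but 1023 head movements in this model, i.e. logarithmic versus linear cost in the length of the list.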


Another limitation of Turing machines is that they do not model concurrency well. For example, there is a bound on the size of integer that can be computed by an always-halting nondeterministic Turing machine starting on a blank tape. (See article on unbounded nondeterminism.) By contrast, there are always-halting concurrent systems with no inputs that can compute an integer of unbounded size. (A process can be created with local storage that is initialized with a count of 0 that concurrently sends itself both a stop and a go message. When it receives a go message, it increments its count by 1 and sends itself a go message. When it receives a stop message, it stops with an unbounded number in its local storage.)
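The stop/go process described in parentheses can be traced with a small sequential simulation. This is only a sketch of the message protocol: the arbitrary delivery order of a concurrent system is replaced here by a pseudo-random choice, which is an assumption of this example and not part of the original argument, and the function name `stop_go_counter` is hypothetical.

```python
import random


def stop_go_counter(rng):
    """Simulate the process: deliver pending messages in arbitrary order."""
    count = 0
    pending = ["stop", "go"]  # both messages are sent at the start
    while True:
        message = rng.choice(pending)
        if message == "stop":
            return count      # stop delivered: halt with the current count
        # A go message was delivered: increment and re-send go to itself,
        # so the set of pending messages is unchanged.
        count += 1


print(stop_go_counter(random.Random(0)))
```

Each run halts once the stop message is delivered, but nothing in the protocol limits how many go messages may be delivered first, which is the sense in which the final count is unbounded.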

History
Turing machines were described in 1936 by Alan Turing.

Historical background: computational machinery
Robin Gandy (1919–1995), a student of Alan Turing (1912–1954) and his lifelong friend, traces the lineage of the notion of "calculating machine" back to Babbage (circa 1834) and actually proposes "Babbage's Thesis":

That the whole of development and operations of analysis are now capable of being executed by machinery.
(italics in Babbage as cited by Gandy, p. 54)

Gandy's analysis of Babbage's Analytical Engine describes the following five operations (cf p. 52–53):

1. The arithmetic functions +, −, × where − indicates "proper" subtraction x − y = 0 if y ≥ x
2. Any sequence of operations is an operation
3. Iteration of an operation (repeating n times an operation P)
4. Conditional iteration (repeating n times an operation P conditional on the "success" of test T)
5. Conditional transfer (i.e. conditional "goto").

Gandy states that "the functions which can be calculated by (1), (2), and (4) are precisely those which are Turing computable." (p. 53). He cites other proposals for "universal calculating machines", including those of Percy Ludgate (1909), Leonardo Torres y Quevedo (1914), Maurice d'Ocagne (1922), Louis Couffignal (1933), Vannevar Bush (1936), Howard Aiken (1937). However:

... the emphasis is on programming a fixed iterable sequence of arithmetical operations. The fundamental importance of conditional iteration and conditional transfer for a general theory of calculating machines is not recognized ...
(Gandy p. 55)

The Entscheidungsproblem (the "decision problem"): Hilbert's tenth question of 1900
With regard to Hilbert's problems posed by the famous mathematician David Hilbert in 1900, an aspect of problem #10 had been floating about for almost 30 years before it was framed precisely. Hilbert's original expression for #10 is as follows:

10. Determination of the solvability of a Diophantine equation. Given a Diophantine equation with any number of unknown quantities and with rational integral coefficients: To devise a process according to which it can be determined in a finite number of operations whether the equation is solvable in rational integers.

The Entscheidungsproblem [decision problem for first-order logic] is solved when we know a procedure that allows for any given logical expression to decide by finitely many operations its validity or satisfiability ... The Entscheidungsproblem must be considered the main problem of mathematical logic.
(quoted, with this translation and the original German, in Dershowitz and Gurevich, 2008)

By 1922, this notion of "Entscheidungsproblem" had developed a bit, and H. Behmann stated that

... most general form of the Entscheidungsproblem [is] as follows:
A quite definite generally applicable prescription is required which will allow one to decide in a finite number of steps the truth or falsity of a given purely logical assertion ...
(Gandy p. 57, quoting Behmann)

Behmann remarks that ... the general problem is equivalent to the problem of deciding which mathematical propositions are true.
(ibid.)

If one were able to solve the Entscheidungsproblem then one would have a "procedure for solving many (or even all) mathematical problems".
(ibid., p. 92)

By the 1928 international congress of mathematicians Hilbert "made his questions quite precise. First, was mathematics complete ... Second, was mathematics consistent ... And thirdly, was mathematics decidable?" (Hodges p. 91, Hawking p. 1121). The first two questions were answered in 1930 by Kurt Gödel at the very same meeting where Hilbert delivered his retirement speech (much to the chagrin of Hilbert); the third, the Entscheidungsproblem, had to wait until the mid-1930s.

The problem was that an answer first required a precise definition of "definite general applicable prescription", which Princeton professor Alonzo Church would come to call "effective calculability", and in 1928 no such definition existed. But over the next 6–7 years Emil Post developed his definition of a worker moving from room to room writing and erasing marks per a list of instructions (Post 1936), as did Church and his two students Stephen Kleene and J. B. Rosser by use of Church's lambda-calculus and Gödel's recursion theory (1934). Church's paper (published 15 April 1936) showed that the Entscheidungsproblem was indeed "undecidable" and beat Turing to the punch by almost a year (Turing's paper submitted 28 May 1936, published January 1937). In the meantime, Emil Post submitted a brief paper in the fall of 1936, so Turing at least had priority over Post. While Church refereed Turing's paper, Turing had time to study Church's paper and add an Appendix where he sketched a proof that Church's lambda-calculus and his machines would compute the same functions.

But what Church had done was something rather different, and in a certain sense weaker. ... the Turing construction was more direct, and provided an argument from first principles, closing the gap in Church's demonstration.
(Hodges p. 112)

And Post had only proposed a definition of calculability and criticized Church's "definition", but had proved nothing.

Alan Turing's a- (automatic-)machine
In the spring of 1935 Turing, as a young Master's student at King's College Cambridge, UK, took on the challenge; he had been stimulated by the lectures of the logician M. H. A. Newman "and learned from them of Gödel's work and the Entscheidungsproblem ... Newman used the word 'mechanical' ... In his obituary of Turing 1955 Newman writes:

To the question 'what is a "mechanical" process?' Turing returned the characteristic answer 'Something that can be done by a machine' and he embarked on the highly congenial task of analysing the general notion of a computing machine.
(Gandy, p. 74)

Gandy states that:

I suppose, but do not know, that Turing, right from the start of his work, had as his goal a proof of the undecidability of the Entscheidungsproblem. He told me that the 'main idea' of the paper came to him when he was lying in Grantchester meadows in the summer of 1935. The 'main idea' might have either been his analysis of computation or his realization that there was a universal machine, and so a diagonal argument to prove unsolvability.
(ibid., p. 76)

While Gandy believed that Newman's statement above is "misleading", this opinion is not shared by all. Turing had a lifelong interest in machines: "Alan had dreamt of inventing typewriters as a boy; [his mother] Mrs. Turing had a typewriter; and he could well have begun by asking himself what was meant by calling a typewriter 'mechanical'" (Hodges p. 96). While at Princeton pursuing his PhD, Turing built a Boolean-logic multiplier (see below). His PhD thesis, titled "Systems of Logic Based on Ordinals", contains the following definition of "a computable function":

It was stated above that 'a function is effectively calculable if its values can be found by some purely mechanical process'. We may take this statement literally, understanding by a purely mechanical process one which could be carried out by a machine. It is possible to give a mathematical description, in a certain normal form, of the structures of these machines. The development of these ideas leads to the author's definition of a computable function, and to an identification of computability with effective calculability. It is not difficult, though somewhat laborious, to prove that these three definitions [the 3rd is the λ-calculus] are equivalent.
(Turing (1939) in The Undecidable, p. 160)

When Turing returned to the UK he ultimately became jointly responsible for breaking the German secret codes created by encryption machines called "The Enigma"; he also became involved in the design of the ACE (Automatic Computing Engine): "[Turing's] ACE proposal was effectively self-contained, and its roots lay not in the EDVAC [the USA's initiative], but in his own universal machine" (Hodges p. 318). Arguments still continue concerning the origin and nature of what has been named by Kleene (1952) Turing's Thesis. But what Turing did prove with his computational-machine model appears in his paper On Computable Numbers, With an Application to the Entscheidungsproblem (1937):

[that] the Hilbert Entscheidungsproblem can have no solution ... I propose, therefore to show that there can be no general process for determining whether a given formula U of the functional calculus K is provable, i.e. that there can be no machine which, supplied with any one U of these formulae, will eventually say whether U is provable.
(from Turing's paper as reprinted in The Undecidable, p. 145)


Turing's example (his second proof): If one is to ask for a general procedure to tell us: "Does this machine ever print 0", the question is "undecidable".

1937–1970: The "digital computer", the birth of "computer science"

In 1937, while at Princeton working on his PhD thesis, Turing built a digital (Boolean-logic) multiplier from scratch, making his own electromechanical relays (Hodges p. 138). "Alan's task was to embody the logical design of a Turing machine in a network of relay-operated switches ..." (Hodges p. 138). While Turing might initially have been just curious and experimenting, quite earnest work in the same direction was going on in Germany (Konrad Zuse (1938)) and in the United States (Howard Aiken and George Stibitz (1937)); the fruits of their labors were used by the Axis and Allied military in World War II (cf Hodges p. 298–299).

In the early to mid-1950s Hao Wang and Marvin Minsky reduced the Turing machine to a simpler form (a precursor to the Post-Turing machine of Martin Davis); simultaneously European researchers were reducing the new-fangled electronic computer to a computer-like theoretical object equivalent to what was now being called a "Turing machine". In the late 1950s and early 1960s, the coincidentally parallel developments of Melzak and Lambek (1961), Minsky (1961), and Shepherdson and Sturgis (1961) carried the European work further and reduced the Turing machine to a more friendly, computer-like abstract model called the counter machine. Elgot and Robinson (1964), Hartmanis (1971), and Cook and Reckhow (1973) carried this work even further with the register machine and random access machine models, but basically all are just multi-tape Turing machines with an arithmetic-like instruction set.

1970–present: the Turing machine as a model of computation

Today the counter, register and random-access machines and their sire the Turing machine continue to be the models of choice for theorists investigating questions in the theory of computation. In particular, computational complexity theory makes use of the Turing machine:

Depending on the objects one likes to manipulate in the computations (numbers like nonnegative integers or alphanumeric strings), two models have obtained a dominant position in machine-based complexity theory:

the off-line multitape Turing machine..., which represents the standard model for string-oriented computation, and
the random access machine (RAM) as introduced by Cook and Reckhow ..., which models the idealized Von Neumann style computer.
(van Emde Boas 1990:4)

Only in the related area of analysis of algorithms this role is taken over by the RAM model.
(van Emde Boas 1990:16)


Notes

[1] The idea came to him in mid-1935 (perhaps, see more in the History section) after a question posed by M. H. A. Newman in his lectures:

"Was there a definite method, or as Newman put it, a mechanical process which could be applied to a mathematical statement, and which would come up with the answer as to whether it was provable" (Hodges 1983:93). Turing submitted his paper on 31 May 1936 to the London Mathematical Society for its Proceedings (cf Hodges 1983:112), but it was published in early 1937 and offprints were available in February 1937 (cf Hodges 1983:129).

[2] See the definition of "innings" on Wiktionary


Undecidable problem

In computability theory and computational complexity theory, an undecidable problem is a decision problem for which it is impossible to construct a single algorithm that always leads to a correct yes-or-no answer.

A decision problem is any arbitrary yes-or-no question on an infinite set of inputs. Because of this, it is traditional to define the decision problem equivalently as the set of inputs for which the problem returns yes. These inputs can be natural numbers, but also values of some other kind, such as strings of a formal language. Using some encoding, such as a Gödel numbering, the strings can be encoded as natural numbers. Thus, a decision problem informally phrased in terms of a formal language is also equivalent to a set of natural numbers. To keep the formal definition simple, it is phrased in terms of subsets of the natural numbers.

Formally, a decision problem is a subset of the natural numbers. The corresponding informal problem is that of deciding whether a given number is in the set. A decision problem A is called decidable or effectively solvable if A is a recursive set. A problem is called partially decidable, semidecidable, solvable, or provable if A is a recursively enumerable set. Problems that are not decidable are called undecidable; this includes problems that are partially decidable but not decidable.
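
The difference between a decidable and a partially decidable set can be illustrated with a minimal Python sketch. The names is_even, semi_decide and squares are hypothetical and chosen only for this illustration; the set of squares is in fact decidable and merely stands in for an arbitrary recursively enumerable set given by an enumeration.

```python
from itertools import count
from typing import Callable, Iterator


def is_even(n: int) -> bool:
    """Decider for the set of even numbers: always halts with yes or no."""
    return n % 2 == 0


def semi_decide(n: int, enumerate_set: Callable[[], Iterator[int]]) -> bool:
    """Halt with True if n is ever enumerated; may run forever otherwise."""
    for m in enumerate_set():
        if m == n:
            return True
    return False  # reached only when the enumeration happens to be finite


def squares() -> Iterator[int]:
    """Enumerate 0, 1, 4, 9, ... without end (illustrative set only)."""
    return (k * k for k in count())
```

Calling semi_decide(49, squares) halts with a yes answer, whereas semi_decide(50, squares) never returns. A decider must halt on both kinds of input; a semi-decider is only required to halt on members of the set.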

The undecidable problem in computability theory

In computability theory, the halting problem is a decision problem which can be stated as follows:

Given a description of a program and a finite input, decide whether the program finishes running or will run forever.

Alan Turing proved in 1936 that no general algorithm running on a Turing machine can solve the halting problem for all possible program-input pairs. Hence, the halting problem is undecidable for Turing machines.
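
The usual diagonal argument behind this result can be sketched in Python. This is an illustration rather than Turing's original construction: programs are modelled as Python callables instead of textual descriptions, and make_diagonal and always_no are hypothetical names. Given any candidate decider halts(program, data), the sketch builds a program on which that candidate must be wrong.

```python
def make_diagonal(halts):
    """Build a program that does the opposite of what `halts` predicts."""
    def diagonal(program):
        if halts(program, program):
            while True:   # candidate predicts "halts", so run forever
                pass
        return "halted"   # candidate predicts "runs forever", so halt at once
    return diagonal


def always_no(program, data):
    """A deliberately wrong candidate decider: claims nothing ever halts."""
    return False


d = make_diagonal(always_no)
prediction = always_no(d, d)  # the candidate predicts that d(d) runs forever
outcome = d(d)                # yet d(d) returns immediately
```

Here the candidate predicts that d(d) never halts, while d(d) in fact returns. A candidate that answered yes instead would make d(d) loop forever, so it would be wrong as well; the same construction defeats every candidate.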


Relationship with Gödel's incompleteness theorem

The concepts raised by Gödel's incompleteness theorems are very similar to those raised by the halting problem, and the proofs are quite similar. In fact, a weaker form of the First Incompleteness Theorem is an easy consequence of the undecidability of the halting problem. This weaker form differs from the standard statement of the incompleteness theorem by asserting that a complete, consistent and sound axiomatization of all statements about natural numbers is unachievable. The "sound" part is the weakening: it means that we require the axiomatic system in question to prove only true statements about natural numbers. It is important to observe that the standard form of Gödel's First Incompleteness Theorem is completely unconcerned with the question of truth, and concerns only the question of whether a statement can be proven.

The weaker form of the theorem can be proved from the undecidability of the halting problem as follows. Assume that we have a sound (and hence consistent) and complete axiomatization of all true first-order logic statements about natural numbers. Then we can build an algorithm that enumerates all these statements. This means that there is an algorithm N(n) that, given a natural number n, computes a true first-order logic statement about natural numbers such that, for all the true statements, there is at least one n such that N(n) yields that statement. Now suppose we want to decide if the algorithm with representation a halts on input i. We know that this statement can be expressed with a first-order logic statement, say H(a, i). Since the axiomatization is complete it follows that either there is an n such that N(n) = H(a, i) or there is an n' such that N(n') = ¬ H(a, i). So if we iterate over all n until we either find H(a, i) or its negation, we will always halt, and by soundness the statement we find is true. This gives us an algorithm to decide the halting problem. Since we know that there cannot be such an algorithm, it follows that the assumption that there is a sound and complete axiomatization of all true first-order logic statements about natural numbers must be false.
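
The search procedure used in this argument can be sketched in Python. The sketch assumes that statements are represented by hashable values and that N is supplied as an ordinary function; decide_by_proof_search and toy_N are hypothetical names. The toy enumerator covers only trivial statements about parity, since by the argument above no enumerator for H(a, i) and its negation can exist.

```python
from itertools import count
from typing import Callable, Hashable


def decide_by_proof_search(N: Callable[[int], Hashable],
                           statement: Hashable,
                           negation: Hashable) -> bool:
    """Scan N(0), N(1), ... until the statement or its negation appears.

    Terminates for every statement only if N is complete; the answer is
    correct only if N is sound.
    """
    for n in count():
        s = N(n)
        if s == statement:
            return True
        if s == negation:
            return False


def toy_N(n: int) -> Hashable:
    """Toy stand-in for N: enumerates the true parity statement about n."""
    return ("even", n) if n % 2 == 0 else ("not-even", n)
```

With the toy enumerator, decide_by_proof_search(toy_N, ("even", 6), ("not-even", 6)) halts with a yes answer and the corresponding call for 7 halts with a no answer. Replacing toy_N by an enumerator of a sound and complete axiomatization of arithmetic would turn the same loop into a decider for the halting problem, which is the contradiction.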

Examples of undecidable problems

Undecidable problems can be related to different topics, such as logic, abstract machines or topology. Note that since there are uncountably many undecidable problems, any list is necessarily incomplete.

Examples of undecidable statements

There are two distinct senses of the word "undecidable" in contemporary use. The first of these is the sense used in relation to Gödel's theorems, that of a statement being neither provable nor refutable in a specified deductive system. The second sense is used in relation to computability theory and applies not to statements but to decision problems, which are countably infinite sets of questions each requiring a yes or no answer. Such a problem is said to be undecidable if there is no computable function that correctly answers every question in the problem set. The connection between these two is that if a decision problem is undecidable (in the recursion theoretical sense) then there is no consistent, effective formal system which proves for every question A in the problem either "the answer to A is yes" or "the answer to A is no".

Because of the two meanings of the word undecidable, the term independent is sometimes used instead of undecidable for the "neither provable nor refutable" sense. The usage of "independent" is also ambiguous, however. It can mean just "not provable", leaving open whether an independent statement might be refuted.

Undecidability of a statement in a particular deductive system does not, in and of itself, address the question of whether the truth value of the statement is well-defined, or whether it can be determined by other means. Undecidability only implies that the particular deductive system being considered does not prove the truth or falsity of the statement. Whether there exist so-called "absolutely undecidable" statements, whose truth value can never be known or is ill-specified, is a controversial point among various philosophical schools.

One of the first problems suspected to be undecidable, in the second sense of the term, was the word problem for groups, first posed by Max Dehn in 1911, which asks if there is a finitely presented group for which no algorithm exists to determine whether two words are equivalent. This was shown to be the case in 1952.


The combined work of Gödel and Paul Cohen has given two concrete examples of undecidable statements (in the first sense of the term): The continuum hypothesis can neither be proved nor refuted in ZFC (the standard axiomatization of set theory), and the axiom of choice can neither be proved nor refuted in ZF (which is all the ZFC axioms except the axiom of choice). These results do not require the incompleteness theorem. Gödel proved in 1940 that neither of these statements could be disproved in ZF or ZFC set theory. In the 1960s, Cohen proved that neither is provable from ZF, and the continuum hypothesis cannot be proven from ZFC.

In 1970, Soviet mathematician Yuri Matiyasevich showed that Hilbert's Tenth Problem, posed in 1900 as a challenge to the next century of mathematicians, cannot be solved. Hilbert's challenge sought an algorithm which decides whether a given Diophantine equation has a solution in integers. Diophantine equations include the equation of Fermat's Last Theorem as a special case; we seek the integer roots of a polynomial in any number of variables with integer coefficients. Since we have only one equation but n variables, infinitely many solutions exist (and are easy to find) over the complex numbers; the problem becomes difficult (impossible) by constraining solutions to integer values only. Matiyasevich completed the proof that every recursively enumerable set can be described by a Diophantine equation; since some recursively enumerable sets are undecidable, no algorithm of the kind Hilbert sought can exist.[1]
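
Although solvability of Diophantine equations is undecidable, it is partially decidable: a brute-force search halts whenever an integer solution exists. The following Python sketch illustrates this; find_integer_root and circle are hypothetical names, and the polynomial is assumed to be supplied as an ordinary Python function of nvars integer arguments.

```python
from itertools import count, product
from typing import Callable, Tuple


def find_integer_root(p: Callable[..., int], nvars: int) -> Tuple[int, ...]:
    """Search integer tuples in growing boxes for a root of p.

    Halts if and only if p has an integer root; otherwise runs forever.
    """
    for bound in count():
        for xs in product(range(-bound, bound + 1), repeat=nvars):
            # Only test tuples that were not already covered by a smaller box.
            if max(abs(x) for x in xs) == bound and p(*xs) == 0:
                return xs


def circle(x: int, y: int) -> int:
    """Example polynomial x^2 + y^2 - 25, which has integer roots."""
    return x * x + y * y - 25


root = find_integer_root(circle, 2)
```

For a polynomial with no integer root, such as x*x - 2, the same call never returns. No uniform bound on the search is computable in general, so the procedure cannot be turned into a decider.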

In 1936, Alan Turing proved that the halting problem (the question of whether or not a Turing machine halts on a given input) is undecidable, in the second sense of the term. This result was later generalized by Rice's theorem.

In 1973, the Whitehead problem in group theory was shown to be undecidable, in the first sense of the term, in standard set theory.

In 1977, Paris and Harrington proved that the Paris–Harrington principle, a version of the Ramsey theorem, is undecidable in the axiomatization of arithmetic given by the Peano axioms but can be proven to be true in the larger system of second-order arithmetic.

Kruskal's tree theorem, which has applications in computer science, is also undecidable from the Peano axioms but provable in set theory. In fact Kruskal's tree theorem (or its finite form) is undecidable in a much stronger system codifying the principles acceptable on the basis of a philosophy of mathematics called predicativism.

Goodstein's theorem is a statement about the Ramsey theory of the natural numbers that Kirby and Paris showed is undecidable in Peano arithmetic.

Gregory Chaitin produced undecidable statements in algorithmic information theory and proved another incompleteness theorem in that setting. Chaitin's theorem states that for any theory that can represent enough arithmetic, there is an upper bound c such that no specific number can be proven in that theory to have Kolmogorov complexity greater than c. While Gödel's theorem is related to the liar paradox, Chaitin's result is related to Berry's paradox.

Douglas Hofstadter gives a notable alternative proof of incompleteness, inspired by Gödel, in his book Gödel, Escher, Bach.

In 2006, researchers Kurtz and Simon, building on earlier work by J.H. Conway in the 1970s, proved that a natural generalization of the Collatz problem is undecidable.[2]
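
The kind of function involved in the generalized Collatz problem can be sketched in Python. This is only an illustration under assumptions: it takes the generalization to consist of piecewise-linear maps g(n) = a[i]*n + b[i], where i is n modulo a fixed period, with rational coefficients chosen so that g stays integer-valued. The names make_g and reaches_one are hypothetical, and the step limit exists because no terminating test is available in general.

```python
from fractions import Fraction
from typing import Callable, Optional, Sequence


def make_g(a: Sequence[Fraction], b: Sequence[Fraction]) -> Callable[[int], int]:
    """Return g with g(n) = a[i]*n + b[i], where i = n mod len(a)."""
    period = len(a)

    def g(n: int) -> int:
        value = a[n % period] * n + b[n % period]
        assert value.denominator == 1, "coefficients must keep g integer-valued"
        return int(value)

    return g


def reaches_one(g: Callable[[int], int], n: int, max_steps: int) -> Optional[bool]:
    """True if iterating g from n hits 1 within max_steps, else None (unknown)."""
    for _ in range(max_steps):
        if n == 1:
            return True
        n = g(n)
    return True if n == 1 else None


# The classical Collatz map: n/2 for even n, 3n + 1 for odd n.
collatz = make_g([Fraction(1, 2), Fraction(3)], [Fraction(0), Fraction(1)])
```

The call reaches_one(collatz, 6, 100) reports that 1 is reached, while a limit that is too small yields None rather than a no answer. A bounded search of this kind can confirm a yes answer but can never establish a no answer.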

