konrad zuse's legacy: the architecture of the z1 and...

6
--- -- .' .., .- - --:...~~~.. - Konrad Zuse's Legacy: The Architecture of the Z1 and Z3 RAUL ROJAS This paper provides a detai/ed description of the architecture of the ZI and Z3 computing machines that Konrad Zuse designed in Berlin between 1936 and 1941. The necessary basic information was obtained from a careful evaluation of the patent application Zuse filed in 1941. Additional insight was gained from a software simulation of the machine's logic. The ZI was bui/t using purely mechanical components; the Z3 used electromechanical relays. However, both machines shared a common logical structure, and their pro- gramming model was the same. I argue that both the ZI and the Z3 pos- sessed features akin to those of modern computers: The memory and proc- essor were separate units, and the processor could handle fIoating7point numbers and compute the four basic arithmetical operations as weil as the square root of a number. The program was stored on punched tape and was read sequentially. In the last section of this paper, I put the architecture of the ZI and Z3 into historical perspective by offering a comparison with com- puting machines built in other countries. Introduction K onrad Zuse is popularly recognized in Germany as the fa- ther of the computer, and his ZI, a programmable automa- ton built from 1936 to 1938, has been called the first computer in the world. Other nations reserve this honor for one of their own scientists, and there has been a long and often acrimonious debate on the issue of who is the true inventor of the computer. Some- times the discussion is preempted by specifying in full detail the technological features of a specific machine. The Electronic Nu- merical Integrator and Computer (ENIAC),for example, has been called the first "Iarge-scale general-purpose electronic computer in the world.,,2 The ENIAC was buiIt at the Moore School of Electrical Engineering of the University of Pennsylvania from May 1943 to 1945. It solved its first problem in December 1945 and was officially presented in February 1946. Another contender for the tide of the first computer is the Mark I, built by Howard Aiken at Harvard University between 1939 and 1944. The Mark I was an electromechanieal machine, not of the all-mechanical na- ture of previous computing deviees and not bujlt with the elec- tronies available at the time.1 The machine lohn Atanasoff built (later called the ABC) at Iowa' State College from 1938 to 1942 used vacuum tubes but was restricted to the addition and subtrac- tion of vectors and had a structure inappropriate for universal computation.2 In direct contrast to these three machines, the ZI was more flexible and was designed to execute a lang and modifi- able sequence of instructions contained on a punched tape. Zuse's machines, the Z3 and the Z4, were not electronic and were of reduced size. Since the Z3 was completed and was successfully working prior to the Mark I, it has been called the first program- mahle calculating machine in the worId. Of course the old debate will not be closed with this paper, but I want to show here just how advanced the machines Zuse built were when considered from the viewpoint of modem computer architecrure and com- pared with other designs of that time. The Berlin Polytechnie srudent Zuse started thinking about computing machines in the 1930s. He reaIized that he could con- struct an automaton capable of executing a sequence of arithmeti- cal operations Iike those needed to compute mathematical tables. Coming from a civil engineering background. he had no formal training in electronics and was not acquainted with the technology used in conventional mechanical caIculators. This nominal deficit worked to his advantage, however, because he had to rethink the whole problem of arithmetic computation and thus hit on new and original solutions. Zuse decided to buiId his first experimental caIculating ma- chine exploiting two main ideas: . the machine would work with binary numbers . the computing and control unit would be separated from the storage. Years before lohn von Neumann explained the advantages of a computer architecture in which the processor is separated from the memory, Zuse had already anived at the same solution. However, it must be said that Charles Babbage had the same idea in the previous century when he designed his Analytical Engine. In 1936, Zuse completed the memory of the machine he had planned. (Zuse called it the Speicherwerk (storage mecha- 1058-6 1 80197/S 10.00 e 1997 IEEE IEEE Annals ofthe History ofComputing, Vol. 19, No. 2, 1997 . 5

Upload: others

Post on 12-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Konrad Zuse's Legacy: The Architecture of the Z1 and Z3page.mi.fu-berlin.de/rojas/1996/Konrad_Zuses_Legacy.pdf · Konrad Zuse's Legacy: The Architecture of the Z1 and Z3 RAUL ROJAS

--- -- .' .., .----:...~~~.. -

Konrad Zuse's Legacy:The Architecture of the Z1 and Z3RAUL ROJAS

This paper provides a detai/ed description of the architecture of the ZI and

Z3 computing machines that Konrad Zuse designed in Berlin between 1936and 1941. The necessary basic information was obtained from a careful

evaluation of the patent application Zuse filed in 1941. Additional insight wasgained from a software simulation of the machine's logic. The ZI was bui/t

using purely mechanical components; the Z3 used electromechanical relays.However, both machines shared a common logical structure, and their pro-gramming model was the same. I argue that both the ZI and the Z3 pos-sessed features akin to those of modern computers: The memory and proc-essor were separate units, and the processor could handle fIoating7pointnumbers and compute the four basic arithmetical operations as weil as thesquare root of a number. The program was stored on punched tape and wasread sequentially. In the last section of this paper, I put the architecture ofthe ZI and Z3 into historical perspective by offering a comparison with com-puting machines built in other countries.

Introduction

Konrad Zuse is popularly recognized in Germany as the fa-ther of the computer, and his ZI, a programmable automa-

ton built from 1936 to 1938, has been called the first computer inthe world. Other nations reserve this honor for one of their ownscientists, and there has been a long and often acrimonious debateon the issue of who is the true inventor of the computer. Some-times the discussion is preempted by specifying in full detail thetechnological features of a specific machine. The Electronic Nu-merical Integrator and Computer (ENIAC),for example, has beencalled the first "Iarge-scale general-purpose electronic computerin the world.,,2 The ENIAC was buiIt at the Moore School ofElectrical Engineering of the University of Pennsylvania fromMay 1943 to 1945. It solved its first problem in December 1945and was officially presented in February 1946. Another contenderfor the tide of the first computer is the Mark I, built by HowardAiken at Harvard University between 1939 and 1944.The Mark Iwas an electromechanieal machine, not of the all-mechanical na-ture of previous computing deviees and not bujlt with the elec-tronies available at the time.1 The machine lohn Atanasoff built(later called the ABC) at Iowa'State College from 1938 to 1942used vacuum tubes but was restricted to the addition and subtrac-tion of vectors and had a structure inappropriate for universalcomputation.2 In direct contrast to these three machines, the ZIwas more flexible and was designed to execute a lang and modifi-able sequence of instructions contained on a punched tape. Zuse'smachines, the Z3 and the Z4, were not electronic and were ofreduced size. Since the Z3 was completed and was successfullyworking prior to the Mark I, it has been called the first program-

mahle calculating machine in the worId. Of course the old debatewill not be closed with this paper, but I want to show here justhow advanced the machines Zuse built were when consideredfrom the viewpoint of modem computer architecrure and com-pared with other designs of that time.

The Berlin Polytechnie srudent Zuse started thinking aboutcomputing machines in the 1930s. He reaIized that he could con-struct an automaton capable of executing a sequence of arithmeti-cal operations Iike those needed to compute mathematical tables.Coming from a civil engineering background. he had no formaltraining in electronics and was not acquainted with the technologyused in conventional mechanical caIculators. This nominal deficitworked to his advantage, however, because he had to rethink thewhole problem of arithmetic computation and thus hit on new andoriginal solutions.

Zuse decided to buiId his first experimental caIculating ma-chine exploiting two main ideas:

. the machine would work with binary numbers. the computing and control unit would be separated from thestorage.

Years before lohn von Neumann explained the advantages of acomputer architecture in which the processor is separated fromthe memory, Zuse had already anived at the same solution.However, it must be said that Charles Babbage had the sameidea in the previous century when he designed his AnalyticalEngine. In 1936, Zuse completed the memory of the machine hehad planned. (Zuse called it the Speicherwerk (storage mecha-

1058-6 180197/S 10.00 e 1997 IEEE

IEEE Annals ofthe History ofComputing, Vol. 19, No. 2, 1997 . 5

Page 2: Konrad Zuse's Legacy: The Architecture of the Z1 and Z3page.mi.fu-berlin.de/rojas/1996/Konrad_Zuses_Legacy.pdf · Konrad Zuse's Legacy: The Architecture of the Z1 and Z3 RAUL ROJAS

- --

the Z3

omput-worked

) adoptonds to

rhe Z3.sor andable ofproces-metical

~ expo-n. Theach in-

proces-"ncbro-of eachll1e 1/0unit. '-

8'a!,d

;a!y

.tabus

ZJ. The

llowingÜficandbits of

oted by: of thement isble val-ored in

11pointvention;it does;.2). sounit is

equivalent to a significand of 15 bits. However. there is a problemwith the number zero, which cannot be expressed using a nor-malized significand. The Z3 uses the convention that any signifi-cand with exponent -64 is to be considered equal to zero. Anynumber with exponent 63 is considered infinitely large. Opera-tions involving zero and intinity are treated as exceptions. and

special hardware monitors the numbers loaded in the processor in

order to set the exception flags (see below). With this convention3the smallest number representable in the memory of the Z3 is 2-6-19 62 18= 1.08 x 10 .and the largest is 1.999 x 2 =9.2 x 10 . The

arguments for computations can be entered as decimal numberson the keyboard of the Z3 (four digits). The exponent of thedecimal representation is entered by pushing the appropriate but-

ton in a row of buttons labeled -8, -7, 7. 8. The oririnal Z3could accept input only between 1 x 10-8and 9.999 x 10 . Zuse'sreconstruction of the Z3 for the Deutsches Museum in Munich

provides enough buttons for larger exponents. With this arrange-ment. the whole numerical capacity of the machine can be re-flected on the acceptable input. The same can be said of the out-put. However. the Z3 does not print the numerical results the pro-gram produces. A single number is displayed on an array of lampsrepresenting the digits from zero to nine. The largest number thatcan be displayed is 19.999. The smallest is 00001. The largestexponent that can be displayed is +8. the smallest -8.

sign significandexponent

b_14. .. .. .~ 7 bits ; 14bits

Fig. 2. The floating-point representation in memory.

Instruction SetThe program for the Z3 is stored on punched tape. One instructionis coded using eight bits for each row of the tape. The instructionset of the Z3 consists of the nine instructions shown in Table 1.

There are three types of instructions: 1/0, memory. and arithmeti-cal operators. The opcode has a variable length of two or five bits.Memory operations encode the address of a word in the lower sixbits; that iso the addressing space has a maximum size of 64words. as mentioned above.

The instructions on the punched tape can be arranged in anyorder. The instructions Lu and Ld (read from keyboard and dis-play result. respectively) halt the machine. so that the operator hasenough time to input a number or write down a result. The ma-chine is then restarted and continues processing the program.

The instruction most conspicuously absent from the instructionset of the Z3 is conditional branch;ng. Loops can be implementedby the simple expedient of bringing together the two ends of thepunched tape. but there is no way to implement conditional se-quences of instructions. The Z3 is therefore not a universal com-puter in the sense of Turing.

Number of CyclesThe Z3 is a clocked machine. Each cycle is divided into fivestages called I. II. III. IV, and V. The instruction in the punchedtape is decoded in Stage I of a cycle. The two basic arithmetical

---

operations of the machine are addition and subtraction of expo-nents and significands. The operations can be executed in the firsttbree stages of each cycle. Stages IV and V are used to preparearguments for the next operation or to write back results.

TABlE 1INSTRUCTION SET AND OPCODES OF THE Z3

The instructions implemented in the Z3 require the foIlowingnumber of cycles:

Multiplication:Division:

Square roD!:

Addition:

Subtraction:

Read keyboard:

Display output:

Load from memory:

Store 10 memory:

16 cycles

18 cycles

20 cycles

3 cycles

4 or 5 cycles. depending on the result

9 1041 cycles. depending on the exponent

9 1041 cycles. depending on the exponent

I cycleo or 1cycle

According to Zuse. the time required for a multiplication wasthree seconds. Considering that a multiplication operation needs16 cycles. one can estimate that the operating frequency of theZ3 was 16/3 ==5.33 Hz. It is a curious coincidence that the gate-level simulation of the Z3 that my students implememed using apersonal computer also required around three seconds for amultiplication.

The instruction most conspicuouslyabsent from the instruction set of the

Z3 is conditional branching.

The number of cycles needed for the read and display instruc-tions is variable. because it depends on the exponent of the argu-ments.Since the input has to be converted from decimal to binaryrepresentation. the number of multiplications needed with thefactor 10 or 0.1 is dictated by the decimal exponent (see below).

Addition and subtraction require more than one cycle because.in the case of floating-point numbers. care has to be taken to setthe size of the exponent of both arguments to the same value. Thisrequires some extra comparisons and shifting.

A number can be stored in memory in zero cycles when the re-sult of the last arithmetical operation can be redirected to the de-sired memory address. In this case. the cycle needed for the storeinstruction overlaps the last cycle of the arithmetical operation.

IEEE Annals ofthe History ofComputing. Vol. 19. No. 2. 1997 · 7

Type Instruction Description Opcode

1/0 Lu read keyboard 01 110000

Ld display result 01 111000

memory Prz load address z 11 Z6ZSZ4Z3Z2Z1

Ps z store address z 10 Z6ZSZ4Z3Z2Z1

arithmetic Lm multiplication 01 001000

Li division 01 010000

Lw square root 01 011000

LS1 addition 01 100000

LS2 subtraction 01 101000

Page 3: Konrad Zuse's Legacy: The Architecture of the Z1 and Z3page.mi.fu-berlin.de/rojas/1996/Konrad_Zuses_Legacy.pdf · Konrad Zuse's Legacy: The Architecture of the Z1 and Z3 RAUL ROJAS

01' is resetrted.

e 23 and;ain issueavailable

Ollunit ofperations'jght sidecers usedprogram-jster pair

and sig-t and thevisible toA and T"'I

'. resp,-_, put into. put intogistel' in-NS selec-

'peration.Jen Be is

::::J

J;;nificands

:. Fd, and'ntents ofe box of1 be seena 01' Ab.

.a. Ab, 01'Jf Part Bthe mul-between

Bf and Ba and a shifter between Bf and Bb. The first shifter candisplace the significand up to two positions to the right and oneposition to "theleft. This amounts to a division of Bf by foul' 01'amu!tiplication with the constant two. The second shifter can dis-p!ace the significand in Af from one to 16 positions to the rightand from one to 15 positions to the left. These shifts are neededfor addition and subtraction of floating-point numbers. Multipli-cation and division with powers of two can therefore be per-formed when the operands for the next arithmetical operation arefetched and, in this sense, do not consume time.

The number of bits used in the registers is as foliows:

AfAaAbAe

BfBaBbBe

17bits191818

7 bits888

As can be seen from this list. Ae uses one extra bit to handle theaddition of the exponents of the arguments. Part B of the proces-SOl'uses two extra bits for the significands (b_15and b_16)andmakes explicit bo, which is not stored in memory.The extra bits atpositions -15 and -16 are included to increase the precision of thecomputations. Therefore. the total number of bits needed to storethe result of an arithmetical operation in Bf is 17 bits. RegistersBa and Bb require more extra bits (ba~.bai' and bbl) to handleintermediate results of some of the numerical algorithms. In par-ticular. the square root algorithm can lead to partial computationsin Ba requiring three bits to the left of the decimal point.

The basic primitive operation of the data path is the addition 01'subtraction 01'exponents 01'significands. When the relay As 01'Bsis set, the negation of the second argument (Ab 01'Bb) is fed intothe ALU. Therefore, if the relay As is set to one. the ALU in PartA subtracts its arguments. otherwise they are added. The same istrue for Part Band the relay Bs. The constant 01'one is needed tobuild the two's complementof a number.

Assurne that two numbers with the same exponent are to beadded. The first exponent is stored in Af. the second in Ab.Since they are equal, no operation has to be performed on thisside of the machine. In Part B, the significand of the first num-bel' is stored in Bf and the significand of the second in Bb. Thefirst step consists of loading Ba with Bf by setting the relay boxFa to one. The addition is performed next. the relay Bt is set toone, and so the result Ba + Bb is assigned to Be. The relay boxFf is now set to one. and the result is stored in Bf. As one cansee. the information can move between registers and so flowthrough the data path. The computer architect has to provide thecorrect sequence of activations of the relay boxes in order to getthe desired operation. This is done in the 23 using a techniquevery similar to microprogrammirtg.

The Control Unit

Fig. 4 shows a more detailed diagram of the control unit and ofthe 1/0 panels. The circuit Pa decodes the opcode 01'the instruc-tion read from the punched tape. If it is a memory instruction.circuit Pb sets the address bus to the value of the lower six bits ofthe opcode. The contro! unit determines the correct microse-quencing of the instructions.There are special circuits for each ofthe operations in the instruction set.

-- ~~~--==-

Circuit 2 represents the panel of buttons used to enter a deci-mal number in the machine. Only one button in each of the foul'columns can be activated. The exponent is set by pressing one ofthe buttons labeled -8 to 8 in circuit K. The output display is verysimilar to the input panel. but here !amps iIluminate the appropri-ate decimal digits, the exponent of the number (circuit Q), as weilas its sign. Note that there is a fifth digit for the output (which canbe only one 01'zero).

exponent of input

PU"""""""control uni!

..'" Udis lav Dmultllic:auon MdivISion 1suateCOOI W3dd1sub SSl~n V

databus data b...

Fig. 4. The control unit and 110 panels.

Once a decimal number has been set, a data bus transmits the

digits to register Ba. and a complex series of operations is started.The decimal input must be transformed into a binary number.Thisrequires a chain of multiplications. which is longer according tothe absolute magnitude of the exponent. If the exponent is zero,the whole transformation requires nine cycles. but if it is -8, theoperation requires 9 + 4 x 8 =41 cycles.

There are a lot of details that the

engineer designing the "microprogram"must keep in mind, otherwise shortcircuits can destroy the hardware.

Microcontrol of the Z3The heart of the control unit is made up of its microsequencers.Before I describe the way they work, it is necessary to take acloser look at the chaining of arithmetical instructions in the 23.Fig. 5 shows the main idea. Each cycle of the 23 is divided intofive stages. Stages IV and V are used to move information aroundin the machine. During Stages I, 11, and III, an addi-tionlsubtraction is computed in Part A and another in Part B of the23. I call this the "execute" phase of an instruction. A typicalinstruction fetches its arguments. executes. and writes back theresult. 2use tOokgreat care to save execution time by overlappingthe fetch stage of the next instruction with the write-back stage ofthe current one. One can think of an execution cycle as consisting

of just two stages, as shown in Fig. 5, where the first two cyclesof aseries of instructions have been labeled. I have adopted thisconvention in the tabulaI' diagrams of the numerical algorithmsdiscussed later in this article.

The microsequencing is done by special contro! wheels. Thereis one for the mu!tiplicationalgorithm. another to control division,and yet another for the square root instruction. The moving armshown in Fig. 6 starts moving clockwise as soon as the control

IEEE Anna/s ofthe History ofComputing, Vol. 19. No. 2,1997 · 9

Page 4: Konrad Zuse's Legacy: The Architecture of the Z1 and Z3page.mi.fu-berlin.de/rojas/1996/Konrad_Zuses_Legacy.pdf · Konrad Zuse's Legacy: The Architecture of the Z1 and Z3 RAUL ROJAS

Jst for the. are much

I in Stages1 by corn-igit one at

~fer to theems moreother reg-le bitwise'nd partial.sters, i.e.,>at whichnputed by:n a bit is[he line is:ircuit can-16are the[he corre-

XOR bCi'carries upactiv.

: bit pOS!-

'§"bc.16

16

4'bd_16

~~RIbe.16-

ns the 23nonnally

1Iconven-Z3 solves.v and un-any arith-'cuit looksIYnumberNn, is set

to one if the number is stored in the register pair <Af,Bf>. If thenumber is stored in the register pair <Ab,Bb>, the relay Nn2is set10one. In this way. we always know if one or both of the argu-ments for an arithmetical operation are zero. Something similar isdone for any exponent of value 63 (an infinite number, accordingto the convenlion). In this case, the relays Ni, or Niz are set toone. according to the register pair in which the number is stored.

Operations involving "exceptional" numbers (zero or infinity)are perfonned as usual, but the result is overridden by the snoop-ing circuit. Assurne, for example, that a multiplication is com-puted and that the first argument is zero (Nn, is set to one). Thecompulation proceeds as usual, but in each cycle the snoopingcircuil produces the result -64 at the output of the adder of Part A.It does not matter what operations are perfonned with the signifi-cands because the exponent of the result is set to -64, and there-fore the final result is zero. Division by an infinite number can behandled in a similar manner. The Z3 can detect undefined opera-tionssuchas 0 / 0, 00 - 00, 00 / 00, and0 x 00. In all thesecases,thecorresponding exception lamp lights on the output panel, and themachine is stopped. The 23 always produces the correct resultwhen one of the arguments is zero or 00 and the other is a numberwithin bounds. This was not the case for the Zl. Zuse thought of,but did not implement, exception handling in the 21. The machinecould not correctly perfonn some computations involving zero.16

An additional circuit looks at the exponent of the result at theoutput of the exponent's adder. If the exponent is greater than orequal to 63, overflow has occurred and the result must be set to 00.

If the exponent is lower than -64, underflow has occurred and the

result must be set to zero. To do this, the appropriate relay (Nnl orNi,) is set to one.

Zuse managed to implement exception handling using just a fewrelays. This feature of the Z3 is one of the most elegant in the wholedesign. Many of the early microprocessors of the 1970s did notinclude exception handling and left it to the softWare. Zuse's ap-proach is sounder, since it frees programmers from the tedium of

checking the bounds of their numbers before each operation.

Addition and Subtraction

In order to add or subtract two floating-point numbers x and y,their representation must be reduced to the same exponent. AfterIhis has been done, only the significands have to be added or sub-tracted. If the exponents are different, the significand of thesmaller number is shifted to the right as many places as necessary(and its exponent is incremented correspondingly to keep thenumber unchanged) until both exponents are equal. It can, ofcourse, happen that the smaller number becomes zero after 17shifts to the right. .

The signs of the two numbers are compared before deciding onthe type of operation to be executed. If an addition has been re-quested and the signs are the same, the addition is perfonned. Ifthe signs are different, a subtraction is executed. If a subtractionhas been requested and the signs are different, an addition is exe-cuted. If the signs are the same, the subtraction is executed. Aspecial circuit sets the sign of the final result according to thesigns of the arguments and the sign of the partial result.

Addition and subtraction are controlled by a chain of relays(nol by a control wheel), since the maximum number of cyclesneeded is low. Fig. 8 shows the synchronization required for theaddition of two numbers. Initially, the arguments for the addition

are stored in the register pairs <Af,Bf> and <Ab,Bb>. In the firstcycle, the exponents are sublracted. In Cycle 2, the significandwith the larger exponent is loaded into register Ba, and the sig-nificand with the smaller exponent is loaded into register Bb. Thesignificand in register Bb is shifted as many places to the right asthe absolute value of the difference of the exponents (exceptionhandling takes care of the case in which the smaller number be-comes zero after Ihe shift). In Stages I, H, and ur of Cycle 2, thesignificands are added, and finally the processor tests if the resultis greater than two. If this is the case, the significand is shiftedone position to the right and the exponenl is incremented by one.Note that the test "if (Be 2:2)" in Part A of the arithmetical unit isdone after Be has already been computed in Part B during StagesI, H, and ur of Cycle 2.

Zuse managed to implement exceptionhandling using just a few relays. Thisfeature of the Z3 is one of the most

elegant in the whole design.

In the case of a subtraction, four or five cycles are needed. Fig.9 shows the synchronizalion required for a sublraction. The firsttwo cycles are almost identical to the first tWocycles of the addi-tion algorithm, but now the significands are subtracted. Cycle 3 isexecuted only when the difference of the significands is negative.The effect of Cycle 3 is just to make the significand of the resultpositive. Cycle 4 is very importam: The difference of two nor-malized significands can have many zeros in the first bit positionsto the left. The result is nonnalized by shifting Be to the left asmany places as necessary (this is done with the shifter betweenthe relay box Fd and register Bb). The number of one-bit shifts issubtracted from the exponent in Part A of the processor. In Cycle5, the result is stored in the register pair <Af,Bf>.

Hcycle I stage I exponent t sigmncand

Fig. 8. The three cycles needed for the addition algorithm. The argu-ments for the addition are stored in the register pairs <Af,Bf> and<Ab,Bb> before the operation is started.

MultiplicationThe multiplication algorithm of the 23 is like the one used fordecimal multiplication by hand, that is, it is based on repeatedadditions of the multiplicator according 10the individual digits ofthe multiplicand. At the beginning of the algorithm, the first ar-gument is stored in the register pair <Af,Bf>. The second argu-ment is stored in the r~gister pair <Ab,Bb>. The temporary regis-

IEEE Annals oftlte History ofCompttting, Vol. 19, No. 2,1997 · 11

0 1,l.III' -""..---- "

1 IV.V Aa:=Af

1.!I.lI! Ae:=Aa-Ab Be:=O+8b

2 IV,V ie (Ae2:;O)then Ab:=O, Aa,;:Af jf (Ae2:0) tbcn Ba.:=Bf. Bb:=Be (shifted)eise Aa:=O else 8&:=8., Bb:=Bf (shitted)

(Be or sr.ve shifted IAel places to the fight)

1.!I,lI! if (Be2::2) then Ae:=Aa+Ab+l Be:=B.+Bbeise Ae:=Aa+Ab

3 IV,V Af:=Ae if (8e2:2) then 8(:=B./2else B(:=9.

Page 5: Konrad Zuse's Legacy: The Architecture of the Z1 and Z3page.mi.fu-berlin.de/rojas/1996/Konrad_Zuses_Legacy.pdf · Konrad Zuse's Legacy: The Architecture of the Z1 and Z3 RAUL ROJAS

ainder is,sition tof register,ult bit iss contin-

~

a+Bba

a+Bb.

l+Bb

aa+Bb3a

ab

I. The ith~tored inle opera-

,ition to:'ter con-

< r<2.:l one is

oe posi-

Z3. Fig.) cycles

- - - -- - -

needed to compute the square root of a number. The argument forthe operation is stored in the register pair <Af,Bf>. The registerpair <Aa,Ba> is initialized to zeroes. The algorithm computes thesquare root of numbers with an even exponent. If the exponent isan odd number, the significand is shifted one place to the left, andthe exponent is decremented by one. The final exponent(computed in Cycle 19) is half this initial exponent.

11 cycl~ ~ exponen~ I sigmftcand

Fig. 11. The 18 cyc1esneeded for the division algorithm. The ith bit ofregister Bf is denoted by Bqi]. The dividend is stored in the registerpair <Af,Bf> and the divisor in <Ab,Bb> before the operation isstarted.

The main idea of the classical algorithm is to reduce the square

root operation to a division. If we want to compute the square rootof x, we need a number Q such that xIQ =Q. The result Q is buHt

sequentially by setting the ith bit to one and then testing whetherthe condition x > Q2 still holds. If this is not the case, the ith bitmust be set to zero.

Assurne that we have al ready computed from bit zero to bit-i + I of the final result. Denote by Q-i+lthe significand

o -I -i+1Q-i+1= Bf[O] x 2 + Bf[-I] x 2 + H' + Bf[-i + 1]2 ,

Bit -i is then set to q-i and it must hold that

- --

2(

.

)2

x<! Q . = Q . l + q .2-'-I -,. -I

This is true if

Q2

( Q2

)-i

( Q 2-i) 0X - -i = X - .;+1- 2 q-i 2 -i+1+ q-i <!

Define t_iusing the expression

2-iQ

2( Q

2) 2-i

(2Q 2_;)t . =x - . = x - . 1

- q- .1 + q .-I -I -,. -I -1+ -,

This can be written as

2-; 2-;+1 2-i(2Q 2

_;)t_i= t_i+1 - q-; -;+1+ q-;

where we have used the recursive definition

2-i+1( Q

2)t_i+t= X- -;+1.

Simplifying the last expression, we finally get:

t-i = 2t_;+1- q-i(2Q-i+1+ 2-iq-i)

If t-i is positive for q-i =1. we set bit -i of the final result to one,i.e., Bf[-i]:=1. If t-i is negative, we set Bf[-i]:=O. The recursivecomputation is started wirh to=X. Q-i+1represents at each step thepartial result contained in register Bf. Bit -i is tentatively set toone, and the sign of t-i is tested. The basic loop of the square rootalgorithm for bit -i has the following form:

Ba := 2 x BeBb := 2 x BfBb[-11:= 1if (Ba - Bb ~ 0) then Be := Ba-Bb, Bf[-i] := I

else Be := Ba, Bf[-i]:= 0

The square root algorithm is thejewel in the crown of the Z3.

All bits of register Bf are used for the computation of thesquare root. If the original number lies within bounds, the result isalso within bounds.

Read and Display InstructionsThe two most complex instructions of the Z3 are those related tothe input and output of decimal numbers. A decimal number offour digits entered through the keyboard is first convened into abinary integer. This is done by reading each digit sequentially,transforming it into a binary number, and storing it in the bitsBa[-IO], Ba[-I1], Ba[-12], and Ba[-13] of register Ba. The num-ber in register Ba is multiplied by 10, and the procedure is re-peated for the other digits. After four iterations, the decimal inputhas been transformed to a binary number (the exponent of thebinary representation is formed indirectly via shifts resulting frommultiplication by 10).The difficult part is handling the exponent.If the exponent e is positive, the significand has to be multiplied etimes with 10. If ir is negative, it must be multiplied Ie I timeswith 0.1. Multiplying with 10 is relatively easy: The significand inBe can be shifted one bit to the left and then stored in Ba (Le.,Ba := 2 x Be) . At the same time, Be can be shifted three places tothe left and can be stored in Bb (Le., Bb := 8 x Be). The additionof Ba and Bb then provides the desired result: the multiplication

IEEE Annals ofthe History ofComputing, Vol. 19, No. 2, 1997 · 13

11 I I I g -----n

0 1.11.111

1 IV.V Aa:=Af Ba:=BrI

1.11.111 Ae,=Aa-Ab ir (Ba-Bb 2: 0) .hen Be:=Ba-Bb, bt:=1else &:=8&. ht:=riI

Aa;:Ae Br:=O2 IV,V Ab:;aO if(bt=l) 'hen Bf(OJ:=1

8&:=2x Be

1.11.111 Ae:=Aa+Ab ir (Ba-Bb 2: 0) then Be:=Ba-Bb. b.:=1ebe Be:=B.. bt:=O

3 IV.V A=Ae ir (bt=l) 'hen Bf(-lJ,=18a.:=2)(Se

1,11.111 Ae:=Aa+Ab ir (Ba-Bb 2: 0) .hen Be:=Ba-Bb. b',=1ebe Be:=B.. b.:=O

i IV.V Aa.:=Ae ir (bt=l) .hen Bf(2-iJ:=1B=2x Be

[.11.111 Ae::Aa.+Ab Ir (Ba-Bb 2: 0) then Be:=Ba-Bb. bt:=1eise 8e:=Ba., bt:=O

:

16 IV,V Aa.::Ae Ir (bt=l) ,hen BI[-141'=18&:=2x Be

I.IIJII Ae:=Aa,+Ab Ir (Ba-Bb 2: 0) .hen Be:=Ba-Bb. b',=1ebe Be:= Ba, b.:=o

,17 IV.V iC(Bf(0l = 0) 'hen Ab:=-1 Ba:=Br, Bb:=O

1.11,111 Ae:=Aa+Ab Be::Ba,.Bb

18 IV,V Af:=Ae IC(Bf(Oj=O)tben Br:=2xBoebe BC:=Bo

Page 6: Konrad Zuse's Legacy: The Architecture of the Z1 and Z3page.mi.fu-berlin.de/rojas/1996/Konrad_Zuses_Legacy.pdf · Konrad Zuse's Legacy: The Architecture of the Z1 and Z3 RAUL ROJAS

--- -- --

times as1 until theween zero1atcan benumber iscontinues1Umberinplications

i Fig. 13.er. Notice~ansferredwhich are

,eful con-which are~ betweenIlgorithm.nt (A'

zero, theVhen it iss relay isJj, bl, andneeded.

tofreswci)0"'0

reudt)

~moli- re ..-

The box labeled "zero, infinite" below Ae represents the circuitsfor exception handling. They snoop perrnanentlyon the data bus(results of operations and data from memory) and raise the corre-spondingexception flags when needed.The shifterbelow Be is usedto displace the signiflcand one bit to the right. This provides thenorrnalizationneeded for the significandwheneverBe <:2.

Fp and Fq are the relays that control the number and directionof one-bit shifts in the shifter below the relay boxes Fc and Fa.Fh, Fi, Fk, Fl, and Fm have the same function in relation to theother shifter. Using these five bits, the numbers between -16 and15 can be represented, and this is also the range of the secondshifter. When such a shift is perforrned, the number representedby the relays Fh to Fm is transferred through the relay box Bn toregister Ab in order to modify the exponent of the result. If thenumber is shifted 10 positions to the left, then +10 is subtractedfrom the exponent of the result. Such drastic shifts are neededmostly after subtractions.

Look again at the diagram of the Z3. Everything makes sensenow and looks as conventional as any modem small floating-point processor. It is indeed amazing how Zuse was able to findthe adequate architecture right from the beginning. The Z3processor employs just 600 relays; the memory needs threetimes as much. By having to optimize the design and by havingto save hardware everywhere, Zuse was forced to think andrethink the logical structure of his machine. He was not allowedthe luxury of the almost unlimited funding allocated by the U.S.military for the development of the ENIAC or by IBM for theMark 1. He was all alone. While this may have worked to hisadvantage from the conceptual side, it mayaiso have worked tohis disadvantage, considering the negligible impact that the Zland Z3 had on the emerging U.S. computer industry after WorldWar lI.l)

The Invention of the ComputerThe main defect of the Z3 was the absence of a conditional branch

in the instruction set. When the program is stored on punchedtape, a possible fix is to include multiple tapes and a mechanismto switch between them (as was done with the Harvard Mark I).Another possibility is having a "program counter," so that the tapecan be advanced or rewound on demand.

Sometimes the dividing line between calculating machinesand universal computers is drawn by differentiating betweenmachines with externally or internally stored programs. I haveargued elsewherelo that this is not a valid criterion. An extemalprogram can work as an interpreter of numerical data. The ex-ternal program becomes a fixed part of the processor, and thedata become the program, much in the same way as a universalTuring machine works as an interpreter. I have argued that whatis needed for universal computation is a minimal instruction setand indirect addressing.11 Indirect addressing can be simulatedby writing self-modifying programs, so that the instruction setbecomes the defining criterion. A machine with enough ad-dressable memory and an accumulator and that is capable ofexecuting the instructions CLR (clear), INC (increment),LOAD, STORE, and BZ (branch if zero) is a universal com-puter. In this sense, the Zl was not a fully fledged computer, butneither were any of the other early machines. The ABC was aspecial-purpose machine for solving sets of linear equations byGaussian elimination; the Harvard Mark I lacked conditional

branching, although it featured loops; the ENIAC was not evenprogrammable through software-the building blocks had to behardwired in dataflow fashion. Conditional branching wasavailable in the ENIAC only in a limited way, and self-modifying programs were, of course, out of the question.

TABLE 2COMPARISON OF ARCHITECTURAL FEATURES

TABLE 3SOME ADDITIONAL ARCHITECTURAL FEATURES

Tables 2 and 3 show the most relevant information about theearly computing machines mentioned earlier. As should be clearfrom the tables, none of the early computing machines fulfillsall the necessary requirements for a universal computer. I alsoinclude the Mark I machine built in Manchester from 1946 to1948, because as far as I know this was the first machine to fitmy definition of a universal computer. The Mark I was builtunder the direction of F.C. Williams and T. Kilbum. This ma-

chine stored its program in random-access digital memory im-plemented with CRT tubes. All necessary instruction primitiveswere available (in modified form), and although it lacked indi-rect addressing, self-modifying programs could be written. Thefirst program ran in June 1948 and calculated the highest properfactor of a large number.7 In September 1948, Turing was ap-pointed as a reader in mathematics in Manchester and wrotesome programs for the first universal computer in the world. Hisvision of universal computation published in 1936, the sameyear in which the storage unit of the ZI was completed, had atlast become a reality. Tables 2 and 3 are emphatic: The inven-tion of the computer was a collective achievement encompass-ing two continents and 12 years.

AcknowledgmentsDeciphering the sketchy documentation available was possibleonly with the collaboration of several of my students at the Uni-versities of HaUe and Berlin. I thank Alexander Thurm and AxelBauer, who implemented a gate-level simulation of the Z3 proces-sor. We became aware of synchronization problems when thesimulation refused to run. I also thank Franz Konieczny,ReimundSpitzer, and Roland Schultes, who wrote part of a stand-alonesimulation of the processor in C. We started working on the Z3with the help of Konrad Zuse, who gladly answered our questions.It was amazing to see how, after almost 60 years, the whole de-sign of the Z3 was still in his head. UnfortUnately,Zuse died inDecember 1995 before this description of his work was ready.This paper is dedicated to his memory.

IEEE Annals ofthe History ofComputing, Vol. 19, No. 2,1997 . 15

Machine memory and conditional soft or hard self-modifying iDdirectCPU sepuated? branching? programming programs? addresaing?

Zuse', Zl v' x soft x xAtanaaoft"s v' x hard x xH.),Iark I x x soft x xENIAC x paniaUy hard x xM.Mark 1 v' v' soft v' x

.Machine interna! fixed-point or bit.sequential architecture technologycoding floating-point? arithmetic?

Z..... Zl binary 80ating 00 sequential mechanica1Atanasoff's binary 6xed.point Y" vectorized electronicH.Mark I decimal fixe<i.point 00 parallel electromechanicalENIAC decimal fixed-point 00 dataftow electronic),I.Mark 1 binary 6xed.point Y" sequential electronic