Friday, October 7, 2016

How we got to where we are

The recent update on the project reported that we have taken a strategic step away from the nuts and bolts of the mechanical drawings and are concentrating on marshalling all known archival sources that bear on the design of the Analytical Engines in its various forms.

Why we took this step had an unforeseen prompt.

Last year, 2015, marked the 200th anniversary of the birth of Augusta Ada, Countess of Lovelace, who, in 1843, wrote in collaboration with Babbage a much-debated account of the Analytical Engine. The bicentennial year focused attention on Lovelace’s life and work: symposia, colloquia, conferences, exhibitions and publications invited perspectives on the significance of Lovelace’s role. This required amongst other efforts a new and detailed look at her famous article.

A celebrated feature of Lovelace’s article was the inclusion of a description of how an Analytical Engine would automatically calculate Bernoulli numbers. The sequence of instructions, the presentational format of the procedures, and the representation of the changing internal states of the Engine have what we now recognise as program-like features. Nowhere does Lovelace or Babbage use the word ‘program’ but for convenience the procedures she described will continue to be referred to in what follows as ‘programs’ on the understanding that to do so is anachronistic.

The Crunch

Rainer Glaschick in Germany was invited to create a simulation of Lovelace’s Bernoulli example for public display in an exhibition at the Heinz Nixdorf MuseumsForum in Paderborn, Germany, and he embarked on the most detailed analysis to date of the algorithmic and instructional content of the example.

Rainer consulted with several others, Bernard Sufrin in Oxford, Tim Robinson in the US, me in the UK and others, to explore and clarify programming structure, underlying and implicit concepts, and hardware capability.

In the course of this Rainer raised searching questions about the computational requirements of the Bernoulli calculation and this prompted further questions about whether the Analytical Engine as designed by the early 1840s would have been capable of calculating the Bernoulli series as Lovelace described.

What emerged from Rainer’s analysis was a clearer understanding of Lovelace’s program. However, the question of whether or not the Analytical Engine of the early 1840s was capable of executing the calculation, and managing the results, pushed to the limits our understanding of the internal control of the Analytical Engine, specifically the question of the basic control arrangements between the punched cards of which there are several types, the processor (‘the Mill’) and memory (the ‘Store’).

Our inability to definitively answer aspects of the control arrangements was partly what prompted us to change our investigative strategy.

Why is Lovelace’s Bernoulli example so difficult?

A taster of what Rainer’s exploration raised for us was the question of memory management.

The Bernoulli numbers form a series in which each new term is generated by a repeated set of operations.

Instructions are coded on Operation Cards – punched cards that specify the instruction to be performed. Associated with each Operation Card are Variable Cards that indicate where the operand is to be found in the Store, and where in the Store a result is to be placed. The relationship between an Operation Card and its Variable Cards is ‘hardwired’ i.e. the association between them is fixed and unalterable.

The Analytical Engine could automatically wind back the set of Operation Cards and rerun them in a loop a prescribed number of times. Since the Variable Cards contain a form of absolute addressing, rerunning the same physical set of Operation Cards will place each new result in the same location in memory i.e. each new result overwrites each last result and, unless printed as they are generated, intermediate results are lost.
For the Bernoulli calculation all prior results in the series are required for the computation of each new result. So results need to be retained in separate locations in the Store for later systematic retrieval.

Lovelace’s Bernoulli program describes looped iteration that does not overwrite the results register i.e. each new result is placed in a separate incremented location in the Store. Without a form of indexed or relative addressing it is not evident how this could have been done by rerunning the same set of cards and we have found no evidence in the design drawings for a relative or indexed addressing capability. The late Allan Bromley, whose work on the designs is the most detailed and penetrating to date, surmised that this is something that Babbage did not resolve.

One way of automatically computing a recurrent series and retaining all intermediate results would be to repeat the sequence of operations by feeding in physically repeated sets of cards but associate with each set a different Variable Card to direct each new result to a new location in the Store. This would allow automatic calculation of the Bernoulli sequence. But it is clear that Lovelace wished for a general solution as a demonstration of the complete generality of the Analytical Engine’s capabilities – enter n as the number of Bernoulli numbers needed and press ‘Go’ for the automatic generation of all n Bernoulli numbers in the series using the same looped set of Operation Cards.

Between 1837 and 1840 Babbage wrote some two dozen programs at least two of which involve recurrence relationships of this kind i.e. the calculation of series requiring the repetition of the same sequence of operations to generate each next term. In the two main examples each new result is generated only from the immediately prior result, and each new result overwrites its immediate predecessor. So if the nth term of a series is required the instruction set would be repeated n times and prior results, other than the (n-1)th are not material. If the prior intermediate results are needed they can be printed as they are generated, but they do not need to be retained internally in the Store to continue the series. Such recurrence relations can be computed by rerunning the same physical set of Operations and Variable Cards and overwriting each last result with each new one.

So while Babbage had described how the Engine would deal with some recurrence relationships he provided no example of a recurrence relationship that required the internal retention and reuse of all prior results. Without a systematic form of incremental addressing it is unclear that Lovelace’s Bernoulli program could run on an Analytical Engine of the early 1840s. This is part of the reason why Lovelace’s Bernoulli example is a more challenging example than any of Babbage’s earlier examples of programs.

Being unable to definitively resolve the arrangements for memory management in a decisive response to Rainer’s questions prompted us to realign our immediate objectives and to redirect our attentions to systematically master what was knowable about the Analytical Engine design – this by establishing a comprehensive database for all known technical sources, ensure that the database searchable and, perhaps most importantly given Babbage’s sometimes fragmented way of working, to internally cross-reference the material. This is a substantial task given the size of the archive but the outcome will allow a systematic analysis and reveal to what extent existing sources constitute a coherent description of a buildable machine.

Rainer Glaschick has made public his analysis of Lovelace’s Bernoulli example. His full account, Ada Lovelace's Calculation of Bernoulli's Numbers, can be found here.

Doron Swade
October 6, 2016

14 comments:

  1. Is it not possible, even probable that this one specific "program" is hypothetical with respect to the capabilities of the Engine at that point? Could not Lovelace have been writing for an extension of the device which did not yet exist?

    ReplyDelete
    Replies
    1. If so then Ada invented the array - which would be a major point in her favor.

      Delete
  2. Is it not appropriate that the first "program" would need debugging?

    ReplyDelete
    Replies
    1. It's not the code that needs debugging. If this report is correct, then the Analytical Engine would have been incapable (even in principle) of calculating the Bernoulli sequence. As I read it, the machine couldn't do "arrays" - which are fundamental to an enormous range of algorithms.

      Delete
  3. I suppose, if the machine could skip a card (an "if" statement) - then you can simulate indexed addressing (painfully):

    // b = a[x]
    if ( x == 0 ) b = a0 ;
    if ( x == 1 ) b = a1 ;
    if ( x == 2 ) b = a2 ;
    ...

    // a[x]=b
    if ( x == 0 ) a0=b ;
    if ( x == 1 ) a1=b ;
    if ( x == 2 ) a2=b ;
    ...

    Painful - but possible?

    ReplyDelete
    Replies
    1. Funny you should mention that. I wrote an email on 12 December 2015 to the team that says:

      "One of the problems talked about at the Lovelace Symposium which I've been watching via live stream (brought up at least by Sufrin) is that the AE does not appear to have any way to use a value in a variable to 'address' another variable.

      To be clear I mean that imagine the V0 (variable column 0) contains the number 2. There's no way to refer to the number that is in V2 (you can swap 2 for any other number).

      I don't believe this is a fundamental limitation of the machine and it can easily be overcome. It requires that the machine have the ability to recognize when a variable has underflowed (or reached zero), a way to decrement a value in a variable, a way to copy one variable to another and a way to skip a fixed number of operation cards. I believe the engine is capable of all those things.

      It's not uncommon in assembly language programming to work with a small number of 'registers'. I'm going to think of columns V0 and V1 in the engine as my registers and do operations on them. So we'll do all operations (+, -, etc.) on V0 and V1 and put results in V2. That's easy to do with the engine.

      Now to work with other values in the store it's enough to move them to V0 and V1 (which the engine can do), perform the operation and move the result back to the appropriate place.

      So... how do you move an arbitrary column Vn to V0 (for example; same thing applies for V1) where the n is a value stored in some other variable. Let's for the purpose of illustration suppose that n is stored in V3 and that this particular machine has a store with ten variable (V0 to V9) and that n could be the value 4, 5, 6, 7, 8, 9 (because those variables are not being used for 'registers' or my 'index' in V3).

      How can we use the value in V3 to select one of V4, V5, ..., V9 and copy its value to V0? If we can do that we can then do operations on arbitrary 'indexed' locations.

      It's easy.

      I'm going to invent some operation cards using my own notation.

      COPY - Copies from one variable to another. e.g. COPY V3, V0 copies the value in V3 to V0.

      DECREMENT - Reduces the value of a variable by 1, e.g. DECREMENT V6 will decrement V6.

      SKIPZ - Skip forward a fixed number of cards if the previous operation hit zero. SKIPZ 4 would skip over 4 cards.

      SKIP - Unconditional skip over a fixed number of cards. SKIP 4 skips 4 cards.

      The following sequence of cards uses the value in V3 as an 'index' and select a variable and copy it to V0.

      DECREMENT V3
      DECREMENT V3
      DECREMENT V3
      DECREMENT V3 ; 4
      SKIPZ 10
      DECREMENT V3 ; 5
      SKIPZ 10
      DECREMENT V3 ; 6
      SKIPZ 10
      DECREMENT V3 ; 7
      SKIPZ 10
      DECREMENT V3 ; 8
      SKIPZ 10
      DECREMENT V3 ; 9
      SKIPZ 10
      COPY V4, V0
      SKIP 9
      COPY V5, V0
      SKIP 7
      COPY V6, V0
      SKIP 5
      COPY V7, V0
      SKIP 3
      COPY V8, V0
      SKIP 1
      COPY V9, V0

      This is to use a modern term a sort of 'jump table'.

      There are many different ways to achieve and it's laborious, but it would work."

      Delete
  4. Fascinating work, and I hope you manage to resolve enough unsolved questions -- to essentially build a mechanical method of executing assembly language -- via a best-possible-effort reproduction of an Analytical Engine! I remember my days of SuperMon64 on a Commodore 64, and to have a mechanical machine essentially execute opcodes, is rather damn impressive. Keep it up!

    ReplyDelete
  5. I guess this is unknowable(?) but I rather think that if Babbage had built an AE that when attempting to construct the Bernoulli program he would have realised how inelegant it was and found a more elegant (relative addressing?) solution. He spent comparatively little time on programs, so I wonder if he just wasn't aware of what was required? Alternatively maybe he was fully aware (e.g. from Lovelace's notes) but thought it was easy to fix and never got around to it? In short, would Babbage really have considered his work done if there wasn't an elegant way to implement the Bernoulli numbers program?

    ReplyDelete
  6. My interpretation of what I read here, is that around 1840 the design had no looping, and thus did not need addressing. Every step in every loop had it's own card. Ada rapidly realised that adding looping was desirable, and used it without solving the other problems it raised.

    ReplyDelete
  7. Even if the archival research shows no evidence that Babbage designed variable indexing sufficient for Lovelace's Bernoulli algorithm, please don't be discouraged from moving towards an historically plausible buildable Analytical Engine if that was your intention. There are many interesting and useful calculations that don't require array indexing or memory pointers.

    The idea to compute Bernoulli number was apparently Ada's: "I want to put in something about Bernoulli’s Numbers, in one of my Notes, as an example of how an implicit function may be worked out by the engine." Perhaps Babbage didn't think hard enough about whether it was an appropriate problem for the Engine.

    ReplyDelete
  8. I know it is not really the done thing to quote oneself, but I can't resist mentioning that my paper "The Origins of Computer Programming" (IEEE Annals of the History of Computing, 16, 4 (1994), pp.6-13.), after discussing the engineering practicalities behind the notion of stored programs, as first enunciated by Eckert & Mauchly, goes on to say that the stored program concept:

    "also has strong connotations of the computer being able to construct, manipulate and then (surpassing the notion that Babbage had arrived at over a century earlier) execute its own programs, all completely automatically. With this latter view the stored program concept becomes an engineering approximation to the theoretical universal automaton that Turing had postulated in his (now) famous 1936 paper – i.e. a machine which is general-purpose in a very fundamental mathematical sense as well as in a very practical sense. Thus, given the practical requirement of replacing the Turing Machine's infinite tape by a sufficiently large store random access store, it is crucial for the computer to be able to calculate the addresses that are used to access the store, rather than only being able to use pre-calculated (i.e. fixed) addresses. This, to my mind, is a crucial characteristic of a modern stored program computer."

    It will be fascinating if Plan 25 can reveal more about Babbage & Lovelace's thinking related to this point.

    A separate issue - how advanced were the programming concepts that were to be used at the microprogram level of Babbage's machines. I don't recall ever seeing this discussed.

    Cheers

    Brian Randell

    ReplyDelete
  9. Babbage certainly did invent "indirect addressing", but not until much later and its not clear he ever worked out the details. In the Cambridge Sketchbook from 11 February 1859 there is the passage:

    [after examination of reduction of linear equations, p.265, p.267 continues on p.295] 11 Feb 1859 It appears that in the solution of n simple eq[uatio]ns the [underlined: indices] of the Variable may be so chosen that they shall follow an arithmetical law [deleted: Here] But whenever such indices follow such a law then of course the engine can easily calculate the law and thus by conveying the numbers calculated directly to the apparatus which selects the variable do away with any necessity for cards This will be the case in all calc[ulatio]ns in which the variables are symmetrical or sym[ilar]ly places [sic] Thus is the development of any function of two three or more infinite series

    ReplyDelete
    Replies
    1. Wow! Thank you Tim so much for sharing. Amazing find & so clearly expressed. You say it so matter of factly, as if this is old news but has this entry been discussed/'published' anywhere before?

      Delete
  10. Could it be possible to validate first the reference symbology on Babbage notes with onthology and entropy AI, that would tell in a sense if he was exchanging symbols for the same meaning, my guess is that this could create a foundation on the abstraction layer.
    This sound as a doctoral thesis, validating first his way of thinking / writing his thoughs
    Best regards
    Angel from Mexico

    ReplyDelete