John Backus passed away on March 17th. He was best known as the father of FORTRAN, the first high-level programming language. A video was made for the 25 years of FORTRAN, showing the various faces behind this revolution in programming. John Backus also invented a notation, known as the Backus-Naur Form, which is still in use today to formally describe the grammar of most programming languages.
He received the 1997 Turing Award for this work. On this occasion, he gave a lecture entitled Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs. This lecture is considered a classic. It is also remarkable because John Backus advocates a functional style of programming that is very different from the style of imperative languages he made popular.
This lecture really contains two different arguments: one about computer languages, and one about computer hardware architecture. The argument about hardware architecture is that what we now call a data bus is really a bottleneck (Backus calls it the “von Neumann bottleneck”). This is one of the key points for justifying a different, more mathematical (and thus more parallel) programming paradigms.
However, I am not really convinced by the kind of notation that Backus advocated. Let’s consider the inner product example he gave. The “von Neumann” programming style he showed was:
c := 0 for i := 1 step 1 until n do c := c + a[i] * b[i]
The suggested functional replacement would look something like:
def InnerProduct = (Insert +) o (ApplyToAll *) o Transpose
In that second code, o denotes functional composition (this is a notation often used in mathematics). The notation Transpose corresponds to the operation of switching lines and columns. The notation ApplyToAll * indicates that you apply the multiplication to each pair of numbers. The notation Insert + indicates that you insert an addition between the resulting adjacent elements.
As terse and parallelizable as this notation might be, it is not at all the way I think about the mathematical inner product. For one, the Transpose that Backus refers to is not the mathematical transposition that appears in one of the possible formulations, where you use a matrix multiplication. The mathematical definition would look more like:
def [X|Y] = tX M Y
where the notation tX denotes the transposition of a vector. By contrast, the transposition Backus uses would transpose both X and Y (see the lecture for details).
Backus also objects to the use of names in the von Neumann style. This argument seems ill founded. First of all, there is a need for naming even with functional composition: o or the defined InnerProduct are names. Second, the mathematical notation very often makes use of variables. It is well understood that when you write f(x)=sin(x)*cos(x), the names x is irrelevant, and that same f can be defined as f(y)=sin(y)*cos(y). Therefore, mathematically, it makes sense to talk about the function f without making any reference to x or y.
However, the function f itself needs to be named. In mathematics like in programming, the name of the function itself can be a variable, like when I define f o g. In practice, when I want to define f o g operationally, I usually need to introduce again a variable x, like: (f o g) (x) = f(g(x)).
I would go further. Names simplify the notation. I find it much clearer to define the inner product as I did above than the variable-free way Backus suggests. Specifically, I find tX M Y to be clearer than (Insert +) o (ApplyToAll *) o Transpose. Maybe that’s just me. But from a concept programming point of view, this means I would favor code that performs the inner product using this kind of matrix multiplication.
Here is how I could do that in XL:
generic [M : matrix := identity] function InnerProduct(X, Y : vector) return scalar written [X|Y] is return transpose(X) * M * Y
From that code, the compiler may be able to deduce, through successive program transforms, that for the common case of identity matrix, the inner product code will end up being equivalent to the following:
function InnerProduct(X, Y : vector[1..5] of real) return real written [X,Y] is result := 0.0 for I in 1..5 loop result += X[I] * Y[I]
But the code above is not what the programmer would write. It’s a program derivation that happens several abstraction layers below what the programmer accesses. Each step of the derivation is precisely specified. This is not the concern of the programmer who wrote the original code, but the concern of some compiler plug-in designer.
Regarding names, the naming rules of XL ensure that I precisely know the scope of X or Y (and therefore that the exact spelling does not matter). Names, here, are not an impediment to understanding the process, they are a way to better understand what we refer to.
More importantly in Backus’ reasoning, nothing in the program above forces the program to go through the von Neumann bottleneck. It is possible (and desirable) that the representation for matrix and vector are efficient, sparse, and that the matrix multiplication is parallelized like crazy or takes advantage of vector units if they exist.
In conclusion, imperative programming languages now can benefit from the high-level of abstraction that Backus thought was reserved to the functional programming style.