Trends in programming languages…

A colleague sent me an interview with Kalani Thielen about trends in programming languages.

I’m fascinated by this interview. Kalani is obviously an expert. But what should we think of the following?

The function, or “implication connective” (aka “->”), is an important tool and ought to feature in any modern language.

I have a problem with this kind of jargon. Why? Because if concept programming teaches us anything, it’s that a computer function is anything but a mathematical “implication connective”. Programming languages are not mathematics.

Programming languages are not mathematics

Computer science courses often spend a lot of time teaching us various aspects of mathematics, such as the lambda calculus. So we are led to believe that programming languages are a sub-branch of mathematics.

And indeed, this is the view that Kalani expresses:

The lambda calculus (and its myriad derivatives) exemplifies this progression at the level of programming languages. In the broadest terms, you have the untyped lambda calculus at the least-defined end (which closely fits languages like Lisp, Scheme, Clojure, Ruby and Python), and the calculus of constructions at the most-defined end (which closely fits languages like Cayenne and Epigram). With the least imposed structure, you can’t solve the halting problem, and with the most imposed structure you (trivially) can solve it. With the language of the untyped lambda calculus, you get a blocky, imprecise image of what your program does, and in the calculus of constructions you get a crisp, precise image.

The statement that I strongly dispute is the last one: in the calculus of constructions, you get a crisp, precise image. The Calculus of Constructions (CoC) is the theoretical basis for tools such as Coq, which are intended to assist with machine-checked proofs. So the very thing they talk about, proofs, is represented crisply by the CoC. But if what I want is to represent the behavior of malloc() (a function whose primary purpose is its side effects) or how the CPUs in a system communicate with one another (physical interactions), then the CoC is of no use. It doesn’t give me a crisp, precise image.

In other words, high-level languages with first-class functions, implicit garbage collection or dynamic typing are really cool, but they give me a blurry, imprecise picture of how the computer actually works. The reason is that a computer is not a mathematical object, it’s a physical object. An “integer” in a computer is only an approximation of the mathematical entity with the same name.

Trying to hide the differences only makes the picture blurrier, not crisper. For example, you can use arbitrary-precision arithmetic instead of integers with a fixed number of bits. But now your “integers” start consuming memory and having other side effects. In any case, these arbitrary-precision numbers are not “native” to the computer, so they remain “blurry”.
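To make that concrete, here is a small Haskell sketch of my own (not part of the original post) contrasting the machine-width Int with the arbitrary-precision Integer: with GHC on a 64-bit machine, the first silently wraps around, while the second keeps growing, but only by allocating ever more memory behind the scenes.

    -- A minimal sketch: fixed-width vs. arbitrary-precision "integers".
    main :: IO ()
    main = do
      -- Int is a machine-word approximation; with GHC it wraps on overflow.
      print ((maxBound :: Int) + 1)      -- -9223372036854775808 on 64-bit
      -- Integer is closer to the mathematical ideal, but every large value
      -- costs heap allocation that grows with the number of digits.
      print ((2 :: Integer) ^ (200 :: Int))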

Programming languages are languages

With concept programming, I’ve consistently argued that programming languages are, first and foremost, languages. Trends in programming languages are a problem of linguistics, not mathematics. The goal should not be to make the language more precise, but to make it more expressive. Nobody cares whether you can solve the halting problem for programs in your language if, to achieve that objective, you have to give up expressiveness.

Let’s make an analogy with real-world languages. Imagine that we decided that it’s important to make English “precise”. We’d set the goal that any sentence in Precisenglish could be provably true or false. First, you’d end up with a language where it’s impossible to write “This sentence is false”, since it’s impossible to prove this sentence true or false. Now, this particular sentence may not seem indispensable. But what about questions? “What time is it?” wouldn’t be a valid Precisenglish sentence… Isn’t that a hefty price to pay for precision?

The same is true for programming languages. You can impose constraints on the language that make it easier to prove things, and then simple things like side effects become really complex. In Haskell, the simple task of writing something to the console requires complex constructions such as monads.
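For readers who have not seen it, here is roughly what that looks like (a minimal sketch of mine, not taken from the interview): even a trivial console exchange is sequenced through the IO monad with (>>=), which is what do-notation desugars to.

    -- A minimal sketch: console I/O lives in the IO monad and is chained
    -- explicitly with (>>=); do-notation is only sugar for this.
    main :: IO ()
    main =
      putStrLn "What is your name?" >>= \_    ->
      getLine                       >>= \name ->
      putStrLn ("Hello, " ++ name)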

Mathematics and programming both derive from languages

It’s interesting to observe that mathematics is also a form of language. It’s a precise, if restricted, language that helps us reason about symbols, build theorems, prove things. So it makes sense that mathematics and programming languages are related: they both are forms of languages. But it’s not because programming languages derive from mathematics. It’s because programming languages and mathematics both derive from languages.

In my opinion, progress in programming languages will happen if we decide to give up mathematics and embrace linguistics, focusing on how we translate the concepts in our heads into computer approximations.

A good language by that definition can adapt to mathematics as well as to other domains of knowledge. The ideal programming language should be capable of representing mathematical properties. It should be possible to write a subset of the language that is precise, just like mathematics can be seen as a precise subset of human languages. But it should also be possible to have a subset that is not necessarily as precise, because it addresses other domains. There should be a subset that deals with I/O, not through monads or other abstractions, but by representing what’s going on in the machine as precisely as possible. There should be a subset that deals with parallelism, or computations on a GPU, or floating-point arithmetic.
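To give one small illustration of why such domains resist a purely mathematical treatment (my own Haskell example, not from the original post), floating-point arithmetic already breaks the algebra you would expect from real numbers:

    -- Floating-point values only approximate the real numbers they stand for.
    main :: IO ()
    main = do
      print (0.1 + 0.2 :: Double)            -- 0.30000000000000004
      print (0.1 + 0.2 == (0.3 :: Double))   -- False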

And you basically have a definition of what I’m trying to achieve with XL.

3 thoughts on “Trends in programming languages…”

  1. I find it interesting that monads are usually presented as “complex”. They do look scary when you first try to read formal explanations like the one on the Wikipedia page. But while learning Haskell, they come naturally from the need to pass information from one function to the next. The “Real World Haskell” book does a great job of demystifying them.

    In any case I totally agree with your point that these high-level languages tend to hide what the computer really does in the end; in big systems, this is usually addressed by having multiple levels of abstraction. If XL can help with that, all the better!

  2. You write:

    I have a problem with this kind of jargon. Why? Because […] a computer function is anything but a mathematical “implication connective”. Programming languages are not mathematics.

    First off, I completely agree with what you’ve said here, depending on your definition of a “computer function”. To be clear, some “computer functions” *do* correspond exactly to the logical interpretation of “->” (that if the input argument can be constructed, the output argument can be constructed, the values correspond 1:1, and no other funny business happens). But some don’t, and that’s exactly where other tools of logic come in (whether Hoare logic or some monadic interface or any of a large number of alternatives). In fact, that’s exactly what I was saying — that you need to look at more than just “->”.
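    To make that distinction concrete, here is a rough Haskell sketch (the names are purely illustrative): the first function fits the logical reading of “->”, while the second needs the IO type, i.e. a monadic interface, to record the extra “funny business”.

        flipBit :: Bool -> Bool      -- a Bool in, a Bool out, nothing else happens
        flipBit b = not b

        askLine :: IO String         -- the IO type marks the effect on the outside world
        askLine = getLine

        main :: IO ()
        main = do
          print (flipBit True)       -- False
          line <- askLine
          putStrLn ("You typed: " ++ line)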

    I also agree that programming languages are not mathematics, although I’m not completely sure what that actually means. But the semantics of languages are about logic (and there’s a rich history of logic that linguists rely on as well). This is just a fundamental fact, like the unity of electricity and magnetism — Curry-Howard shows that logic and computation have the same kind of unity.

    You also wrote:

    […] if what I want is to represent the behavior of malloc() (a function whose primary purpose is its side effects) or how the CPUs in a system communicate with one another (physical interactions), then the CoC is of no use. It doesn’t give me a crisp, precise image.
    […]
    There should be a subset that deals with I/O, not through monads or other abstractions, but by representing what’s going on in the machine as precisely as possible.

    This is exactly the sort of thing that I’m talking about, and Coq (and other proof assistants) actually *do* this sort of thing on a regular basis (check out Intel’s use of such tools, for example). You might be interested in Ben Pierce’s recent work on Software Foundations (with contributions from other researchers as well). I think that you might find this section particularly interesting:

    http://www.cis.upenn.edu/~bcpierce/sf/Hoare.html

  3. Kalani,

    Welcome to this blog, and thanks for the insightful comment. I appreciate the references, and while I only glanced at the link you gave me, I think I will need to spend a lot more time exploring it.

    That being said, let me address two possible misunderstandings, which I attribute largely to my obviously incorrect assumption that everybody knows what I mean and that I therefore never need to explain anything 🙂

    First, you wrote:

    Kalani Thielen:
    To be clear, some “computer functions” *do* correspond exactly to the logical interpretation of “->” (that if the input argument can be constructed, the output argument can be constructed, the values correspond 1:1, and no other funny business happens).

    Concept programming focuses on the translation of ideas into computer artifacts. A mathematical function is a theoretical object, a production of our mind. A computer function, even in the best programming language, is a computerized approximation of that ideal theoretical object. So from the concept programming point of view, no computer function ever corresponds exactly to the mathematical theory. It adds baggage, e.g. execution time, memory usage or energy consumption. It also has a number of limitations, e.g. on the maximum size of its input, none of which you need to care about when dealing with the mathematical function.

    So my original point was not that some languages call “functions” things that happen to have side effects (which I believe is how you interpreted it). Rather, it was that even something as simple as “1” in a computer program is only a remote approximation of the mathematical entity.

    Naturally, we have built layer upon layer of abstractions designed to make the approximation behave like the real thing as much as possible. Still, when you write 12+13 and the computer shows 25, so much has happened under the hood that it’s mind-boggling. Think only of decimal-binary conversions, character encoding conventions, font rendering or the implementation of addition in the CPU’s ALU, and you’ll get a glimpse of what I mean by “anything but” 🙂 Tons of “funny business” happening here, but all designed so that you don’t see it, like any good magic.
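    To give one tiny glimpse of that machinery, here is a toy Haskell sketch of my own (software standing in for hardware) of the ripple-carry addition an ALU performs, bit by bit, when it adds 12 and 13:

        import Data.Bits (testBit)
        import Data.List (mapAccumL)

        -- One-bit full adder: returns (carry-out, sum bit).
        fullAdder :: Bool -> Bool -> Bool -> (Bool, Bool)
        fullAdder cin a b = ((a && b) || (cin && (a /= b)), (a /= b) /= cin)

        -- Ripple-carry addition of two 8-bit values, chaining one full adder
        -- per bit the way an ALU does (silently wrapping past 8 bits).
        add8 :: Int -> Int -> Int
        add8 x y = sum [ 2 ^ i | (i, isSet) <- zip [0 .. 7 :: Int] bits, isSet ]
          where
            (_, bits) = mapAccumL step False [0 .. 7]
            step cin i = fullAdder cin (testBit x i) (testBit y i)

        main :: IO ()
        main = print (add8 12 13)   -- 25, assembled one bit and one carry at a time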

    Second, regarding malloc(). It may very well be that I do not fully comprehend the capabilities of Coq. But I was under the impression that it focuses mostly on what can be proven. As you very justly pointed out on your own blog, what you can prove depends very much on the restrictions you impose initially.

    It just so happens that for malloc(), adding restrictions very quickly diminishes the value of what you can prove:

  • It’s much easier to reason about unlimited memory, but the interesting cases arise in low-memory conditions.
  • It’s much easier to reason assuming that malloc actually allocated memory, but any interesting operating system today implements demand paging. Any reasoning about malloc performance needs to take into account not just malloc() itself, but any pointer that could possibly reach the allocated pages, when the page accesses will happen, and the state of the virtual memory subsystem (CPU, OS and backing-store devices) when those accesses do happen.
  • This brings us to whether you reason about a single program or about an entire workload sharing the same system. To be realistic, you have to consider the effect of other programs on shared resources such as physical memory, the TLB, kernel data structures, and so on.
  • Make the system multi-processor for good measure, and now you also need to reason about what other processors are doing. Does this other CPU flush your cache lines to get atomic access? Did some other process cause physical memory to be fragmented, or can you use “large pages”? Do I/O operations cause DMA to some of the memory we are talking about? And so on.
  • Based on my experience, by the time you have reached that stage, your practical ability to get a “crisp view” of malloc based on logical reasoning is completely dwarfed by what you get from actual measurements and tests. Again, this may be just because I don’t know enough about Coq and similar tools. It may also be because I spent days, even months, trying to understand the behavior of enterprise workloads running in the virtualization environment I had designed for HP.

      Don’t get me wrong. Formal reasoning has a lot of value. So do constraints and restrictions imposed by programming languages to make common mistakes less likely or easier to detect. But I’d say that they give me a crisper view of the theoretical model. They give me a better definition of what the program is supposed to do. I am not convinced that they give me a crisper view of the actual behavior.

      If I can venture an analogy, I’d say that formal proof technologies are to computer scientists what computer simulations are to cosmologists: indispensable for understanding how galaxies form or predicting the evolution of stars, but no match for a telescope when it comes to knowing what actual galaxies look like or observing actual supernovas. Formal proofs and computer simulations give us a “crisp” view of a theoretical model. Actual testing and telescope observation give us a crisp view of reality.

      Does that make sense to you?
