Designing a new programming language

I’ve been working for years on a programming language called XL. XL stands for “extensible language”. The idea is to create a programming language that starts simple and grows with you.

All the work of Taodyne, including our fancy 3D graphics, is based on XL. If you think that Impress.js is cool (and it definitely is), you should definitely check out what Tao Presentations can do. It’s an incredible environment for quickly putting together stunning 3D presentations. And we are talking real 3D here, with or without glasses depending on your display technology. We can even play YouTube videos directly, as the demo below shows.

But the paradox is that the work for Taodyne keeps me so busy I’m no longer spending as much time on the core XL technology as I’d like. Fortunately, vacation periods are an ideal time to relax, and most importantly, think. So I guess it’s no chance if Christmas is one of the periods where XL leaps forward. Last year, for example, I implemented most of the type inference I wanted in XL, and as a result, got it to perform close to C on simple examples.

This year was no exception. I finally came up with a type system that I’m happy with, that is simple to explain, and that will address  the demands of Tao Presentations (notably to implement easy-to-tweak styles). The real design insight was how to properly unify types, functions and data structures. Yep, in XL, types are “functions” now (or, to be precise, will be very soon, I need to code that stuff now).

Another interesting idea that came up is how to get rid of a lot of the underlying C++ infrastructure. Basically, symbol management for XLR was currently dome with C++ data structures called “Symbols” and “Rewrites”. I found a way to do exactly the same thing only with standard XLR data structures, which will allow XLR programs to tweak their own symbol tables if they choose to do so.

An interesting rule of programming: you know you have done something right when it allows you to get rid of a lot of code while improving the capabilities of the software.

Back to coding.

15 thoughts on “Designing a new programming language

  1. When you say “types are functions”, do you mean that type constructors are functions and type attributes can be placed in the constructor object’s namespace so it can also serve as representing the type?

    I’ll look forward to the details of what you’ve been working on.🙂

  2. Kevin,

    By “types are functions”, I mean precisely that types and functions are represented identically

    In XL, a function looks like this:

        succ X -> X+1
    

    It may be defined using pattern matching using multiple declarations, e.g.

        0! -> 1
        N! -> N * (N-1)!
    

    A type declaration is something like Name:Type, for example I can force the above to only work with integer numbers with:

        0! -> !
        N:integer! -> N * (N-1)!
    

    But the big question was: what is "integer"? I wanted to make it so that I could reuse the existing mechanisms. And I did as follows:

        complex -> type complex(re, im)
    

    What this means is: "complex is a type matching any parse tree that looks like complex(re, im)." For example, it will match complex(1,2) or complex(a, b). When I write Z:complex, I basically add a test for the pattern that the input tree will match. And internally, that's exactly what "type" does: it generates a specially-crafted shape test for input trees. Specifically, it creates a "contains" test that indicates if the type contains a given value

    What is interesting is that this gives me a great freedom in how I write types. For example, I can make 1..2 a valid type by writing something like this:

        A:integer..B:integer ->
            contains X:integer -> X>=A and X false
    

    I can then use it in arguments as follows:

        foo X:1..5 -> /* I know X is between 1 and 5 here */
    

    I can even create funny types, like "even" and "odd":

        even ->
            contains X:integer -> X mod 2 = 0
            contains AnythingElse -> false
    

    Hope this makes sense. I'm currently updating the documentation, then I'll update the implementation to match.

    1. Wonderful! This is far better than I expected.🙂 “Types are functions” confused me (although types can be seen as a superset of functions), but your focus on representation (modeling) and examples helped a lot.

      I very much like the rewrite nature of your language. It leads to a more constraint- and relation-oriented view, which is what I am keenly interested in.

      I have a bunch more questions (or more like restatements in my own words to see if I’m understanding you), if that’s ok:

      (1) You wrote: “””And internally, that’s exactly what “type” does: it generates a specially-crafted shape test for input trees. Specifically, it creates a “contains” test that indicates if the type contains a given value“””

      So, “C:complex” escapes the parse tree pattern and modifies the specific Match constraints for “C” such that:

      (a) “complex -> type complex(re, im)” ensures that C matches the pattern “complex(re, im)” in the parse tree, and,

      (b) “complex -> contains XXX” ensures that C matches the constraints given by XXX.

      Is this correct?

      (2) I notice that you decline to define “integer” as a type. Is this a “primitive” which does not need a definition inside your language? Or would you define “integer” by its string representation as it would exist in the parse tree?

      (3) How do you manage equivalent model (type) formats? e.g. conversion between a binary representation of an integer and a string representation of it.

      (4) Can you elaborate on how “type” and “contains” work? Because it seems like “type” is just a specific case of “contains”. Is that right?

      The semantics of “contains” also seems a little loose between your 2 examples, but maybe I just don’t quite understand how it works, or maybe it’s because you’ve just begun.

      But I am liking this a lot and I think you are on the right track. Very exciting!🙂

      Kevin

      1. Yes, but the second one is actually a two-parter (let’s see if I understand WordPress now – Short answer: no):

        A:integer..B:integer ->
            contains X:integer -> X >= A and X <= B
            contains AnythingElse -> false
        
    1. Yup. But what is really neat is that I think I found a way to do this almost efficiently for most cases (i.e. at least the cases where you don’t specify the precondition manually). Not sure until I actually implement it, of course.

  3. Always nice to see other people working on new languages.

    Type inferencing and type unification are sort of those golden ideals everybody would like to achieve. Inevitably something always comes up and kills it though — usually it ends up being performance or compilation time.

  4. Kevin Edwards :

    So, “C:complex” escapes the parse tree pattern and modifies the specific Match constraints for “C” such that:

    (a) “complex -> type complex(re, im)” ensures that C matches the pattern “complex(re, im)” in the parse tree, and,

    (b) “complex -> contains XXX” ensures that C matches the constraints given by XXX.

    Is this correct?

    This sounds quite close, although I’m not sure what you mean with escape the parse tree pattern. Also, the expression “type X” is a way to generate the correct “contains X” tests, you don’t need both contains and type.

    (2) I notice that you decline to define “integer” as a type. Is this a “primitive” which does not need a definition inside your language? Or would you define “integer” by its string representation as it would exist in the parse tree?

    As a symbol, “integer” is a name entered in a particular context (a “compiler scope”, if you wish). So it’s as close to primitive as you can get in XL, but it can still be overridden and has no special rule attached to it. That type matches integer constants in the parse tree.

    I think what you are leading to (and that sounds like a good idea) is that there’s no reason the XL syntax configuration file could not define other such types with, say, regular expressions. Currently, XL is stuck to four terminal types: integer, real, text and name/symbol. But we were wondering if we should add a way to define, say, URLs or similar stuff. That makes a lot of sense, I need to consider that.

    (3) How do you manage equivalent model (type) formats? e.g. conversion between a binary representation of an integer and a string representation of it.

    As written above, that part is currently hard-coded in the scanner, but I think you gave me an idea on how to change that.

    (4) Can you elaborate on how “type” and “contains” work? Because it seems like “type” is just a specific case of “contains”. Is that right?

    More precisely, “type” will generate a contains that matches the given tree. The transformation is almost trivial. “type [Pattern]” is almost equivalent to:

    contains [Pattern] -> true
    contains AnythingElse -> false
    

    In reality, there is a little more to it than that, since the type also defines how arguments are “encapsulated” into parameters. I’m not sure I can make this work at any place other than with special code in the compiler, but still thinking about it.

    The semantics of “contains” also seems a little loose between your 2 examples, but maybe I just don’t quite understand how it works, or maybe it’s because you’ve just begun.

    The real reason is that WordPress ate a whole chunk of the source code, and what was left didn’t make sense. It’s really (I’ll try to get my ampersand-lt; and ampersand-gt; right this time):

    A:integer..B:integer ->
        contains X:integer -> X>=A and X<B
        contains AnythingElse -> false
    

    Hope it makes more sense like this.

  5. This sounds quite close, although I’m not sure what you mean with escape the parse tree pattern. Also, the expression “type X” is a way to generate the correct “contains X” tests, you don’t need both contains and type.

    Good to hear. As I understand it:

    – A normal pattern “N!” parses to “!(N)” and the default Matcher structurally matches against that AST pattern where “!” matches “!”, and “N” matches anything (we know this because it is capitalized?).

    – “C:complex” parses to “:(C, complex)” but that is not really the pattern you want to match against — it is a meta pattern that describes the matching process itself. i.e. “:” is not part of the syntax being defined.

    That’s the sense in which I meant that “:” escapes the parse tree pattern. Does that sound right?

    The real reason is that WordPress ate a whole chunk of the source code, and what was left didn’t make sense. It’s really (I’ll try to get my ampersand-lt; and ampersand-gt; right this time):

    Argh, I see, that is an annoying WordPress quirk. Let me know if you’d prefer to discuss this in a different forum.

        even ->
            contains X:integer -> X mod 2 = 0
            contains AnythingElse -> false
    
        A:integer..B:integer ->
            contains X:integer -> X >= A and X <= B
            contains AnythingElse -> false
    

    Hope it makes more sense like this.

    That helps, but I’m still a little confused. For example, the “A..B” type has “contains X:integer …” but what does that “X:integer” match against?

    In the “even” type definition, “contains X:integer …” seems to match against “N:even”, but in the “A..B” type, “X:integer” doesn’t match against anything — it’s like you switched to using “contains” as a “let”.

    That’s what I meant by the semantics seeming loose — but maybe I just don’t understand what the rules are for “contains” or the LHS of “->”. Could you explain them to me?

    I’m heartened to be a source of good ideas for you! I hope to elaborate on type equivalence at some point and delve into XLR’s source code as time permits. Until then, I truly appreciate your taking the time to explain it to me.🙂

    Kevin

    1. Kevin Edwards :

      That helps, but I’m still a little confused. For example, the “A..B” type has “contains X:integer …” but what does that “X:integer” match against?

      In the “even” type definition, “contains X:integer …” seems to match against “N:even”, but in the “A..B” type, “X:integer” doesn’t match against anything — it’s like you switched to using “contains” as a “let”.

      The “contains” predicate is defined the usual way, i.e. there’s some pattern matching in it already.

      To answer the question about the meaning of the right arrow, it reads as: “rewrites as”. For example, X->0 reads as “X rewrites as 0”. There can be multiple such rewrites, which are tested in order until a match is found. The meaning of “X:integer” is: “X must have the shape defined by integer”.

      So the way to read the contains definition for “even” in the rewrite “foo X:even -> bar X” is the following:

      1) The pattern “foo X:even” matches only when X passes the type check for “even”

      2) The type check for “even” is equivalent to testing “contains X” in the context fo “even”

      3) In that context, we have two possible rewrites for contains X: the first one matches integer numbers, and returns true only if the number is even. The second one matches anything else and returns false.

      So now if you write “foo 3”, we evaluate “contains 3” in the context of even. The rewrite for “contains X:integer” applies, so we return “true”, the test passes, and the pattern “foo X:even” is considered acceptable. For “foo 4”, the same happens except that the rewrite for “contains X:integer” returns “false”, so the pattern “foo X:even” fails. Finally, if you write for example “foo 3.5”, then “contains X:integer” does not match because 3.5 is not an integer, so we take “contains AnythingElse”, which returns false, so we say that it’s not “even”.

      Does that make sense?

      Also, I’m going to post a pointer to the updated XLR reference. Normally, it’s at http://xlr.sourceforge.net/xlr-ref, but I’m suffering from the Drupal “white page of death” when I try to edit it.

  6. Yes, that makes perfect sense. Thanks for the kind and detailed explanation. I apologize, it was actually my question that didn’t make sense.

    I think the main reason I was confused is that “contains” is implicitly targeting the “N” in “N:even”, even though “even” is a rewrite rule with a constant LHS. I’d like to see “N” be a more explicit input to “even” and thus “contains”.

    What do you think about making types even more like functions with an explicit parameter that could be matched? e.g.:

        even X -> integer X and X mod 2 = 0
        
        (integer A..integer B) X -> X >= A and X < B
    

    So, types are essentially defined as a composition of constraints.

    I know that might not look quite right and it actually has many implications (e.g. for “and”, “=”, “>=”, and “<” above), but it is approaching a significant goal of mine which is for questions and assertions to be based upon the same model.

    e.g. “even(X)” would generally assert that X is even, while “?(even(X))” would try to match “even” against “X” and return true if “X” was even. We can, of course, come up with better syntax and use “:”, etc.

    I’ll look forward to reading your updated XLR reference! Maybe you can include a date inside and even in the file name so I can easily track which version I’m reading.

  7. I hope what I wrote is understandable. Please let me know if it isn’t clear. Perhaps I should mention that those type rewrite rules are still meta-level. i.e. in order to apply them positionally, you’d still need to escape the parse tree pattern (as in “X:even”) or apply them in the “where” guard of a rewrite rule.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s