Building all the major open-source web browsers

Mozilla Firefox, Chromium (the open-source variant of Chrome) and WebKit (the basis for Safari) are all great examples of open-source software. The Qt project has a simple webkit-based web browser in their examples. So that’s at least four different open-source web browsers to choose from.

But what does it take to actually build them? The TL;DR answer is that these are complex pieces of software, each of them with rather idiosyncratic build systems, and that you should consider 100GB of disk space to build all the browsers, a few hours of download, and be prepared to learn lots of new, rather specific tools.

LLDB is a piece of crap (update: maybe it’s clang) (update 2: it’s actually ccache)

I’ve been really trying to use LLDB for a while now. Not that I really want to, but Apple went out of its way to make sure I had little choice. Not only is LLDB the default on MacOSX now, but GDB is really hard to make work on that platform as well. Can you imagine you have to generate a digital signature?

The first thing I don’t like about LLDB is its totally painful command structure. The LLDB authors published a GDB-to-LLDB conversion map, which they probably think is helpful. But to me, all it shows is that LLDB commands are more complex and more verbose than their GDB counterparts, with no obvious way to infer the LLDB command from either GDB experience, or from any kind of logic.

But the thing I dislike the most is that LLDB plain does not work, even when used with Apple tools, in a number of situations that I happen to hit practically on a daily basis. For example, it appears to be consistently unable to set breakpoints by file name and line number with command-line options that should be used frequently enough to just work.

Here is an example session that illustrates the problem.

ddd@Marypuce tmp> cat glop.cpp
#include <iostream>

int main()
{
    std::cerr << "Hello World\n";
}
ddd@Marypuce tmp> c++ -c -g glop.cpp -mmacosx-version-min=10.6 -o glop.o
ddd@Marypuce tmp> c++ -g glop.o -mmacosx-version-min=10.6 -o glop
ddd@Marypuce tmp> lldb glop
Current executable set to 'glop' (x86_64).
(lldb) b glop.cpp:5
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(lldb) ^D
ddd@Marypuce tmp> c++ -g glop.cpp -mmacosx-version-min=10.6 -o glop
ddd@Marypuce tmp> lldb glop
Current executable set to 'glop' (x86_64).
(lldb) b glop.cpp:5
Breakpoint 1: where = glop`main + 22 at glop.cpp:5, address = 0x0000000100000e56

[Update] I initially thought the -mmacosx-version-min=10.6 option was necessary for this problem to show up. But it’s also broken without it. I ran the c++ commands with -v to see what the difference was, and apparently, it’s many little things. So separate compilation with debug symbols just does not work. OK, maybe it’s more a compiler thing than a debug thing. So maybe LLDB is not the piece of crap there. Still, that’s where the problem shows up.

This annoying bug shows how fragile this new support is. By making LLDB the default, and integrating it relatively well within Xcode, Apple is trying to slowly boil a frog here. But if you use the tools in a non-standard way, you might be burnt.

[Update 2] I incorrectly blamed clang and lldb for a problem that is actually with ccache. I am running ccache version 3.1.9 from MacPorts. If I get it out of the way, everything is back to normal. I sent a bug report email to the ccache, clang and lldb mailing lists hoping this will help someone else.

The new C++ standard

DevX has a special report on the new C++ standard, currently referred to as C++ 0x because it was due sometime between 2000 and 2009. Well, the standard was quite late compared to early expectations, so an easy joke was that we might end up with 0xA or 0xB, the C++ hexadecimal notation for 10 or 11. Ultimately, chances are that the standard will make it for 2009, so we will probably refer to it as C++ 09…

This new iteration of the language is of interest to all programmers, because it brings a number of major changes to one of the most popular programming languages today, and one that is already very complex (and therefore hard to extend). But for me in particular, it is all the more interesting to consider how various “innovations” in that new standard compare to features that play the same role in XL.


One of the major features in C++ 0x is concepts. DevX has a dedicated article about concepts. In short, concepts in C++ are a way to describe categories of templates, and to help the compiler figure out what the programmer intended for a given template. This new aspect of the language makes it easier to define a real contract between the users of a template and its implementers.

C++ concepts, however, are somewhat annoying to me. One reason is that XL has been for a long time based on an approach that I dubbed concept programming. Concept programming, in the XL sense, is about the relationship between concepts that exist only in our head, and concept representations that exist in the computer. The key idea is to make sure that implementations look and feel like the concepts they represent.

One key consequence of that idea is that a programming language should comfortably support arbitrary concepts, not some finite set (e.g. functions or objects), because the set of concepts we manipulate is not a-priori limited. This is the key reason so much effort was put into making XL extensible.

To summarize, “concepts” in XL are only very remotely related to “concepts” in C++, although, arguably, the XL usage of the word is closer to the standard meaning.

XL generic validation = C++ concepts

Many aspects of XL are a direct consequence of the concept programming design philosophy. For example, XL implemented, since at least 2002, the idea that one can describe how a generic type can be used. This feature is called generic validation in XL terminology. I invite the reader to compare the XL implementation of a minimum function with the C++-with-concepts implementation of the same. This should convince you that the two ideas are basically almost identical.

So where are the differences between C++ concepts and XL generic validation? One of them is how the contract is being specified. In C++, you specify the kind of operators and functions that define the concept. For example, you would write something like the following to indicate that a min function requires a less-than operator:

concept LessThanComparable<typename T> {
  bool operator<(const T& x, const T& y);
}

template<typename T>
requires LessThanComparable<T>
const T& min(const T& x, const T& y) {
  return x < y? x : y;
}

In XL, by contrast, you give an example of code that has to compile with the generic type you want to validate. For example, in XL, you would write something like:

generic type ordered where
    A, B : ordered
    Test : boolean := A < B

function Min (X, Y : ordered) return ordered is
    if X < Y then
        return X
    else
        return Y

Now, as you can see from this simple example, a significant difference is that XL considers the validation to be tied to a generic type, which can then be used to declare a function like Min directly. In other words, since you declared that ordered is generic, Min becomes implicitly generic. By contrast, in C++, LessThanComparable is a kind of predicate that applies to template classes, so you need one additional “connection”, the requires clause, to let the compiler relate the T in the definition of min with the T in LessThanComparable. As a result, the C++ code for that example is more verbose and more convoluted. This becomes more visible as the code becomes more complex.

Another drawback is that the C++ concept specification as written doesn’t work for, say, int because the less-than operator in that case doesn’t have the right signature. So you need an additional concept_map in that case, making the code even more verbose, as shown below:

concept_map LessThanComparable<int> { }

One benefit of the C++ approach, however, is that the specification of the concept makes it easier to validate early that the implementation actually doesn’t require anything besides what is declared in the concept. For example, if the body of min attempts to refer to an operator that is not present in the concept specification, the compiler may detect this. Doing this with the kind of specification given in XL is much more complicated. I am considering various ways to fix this problem, which is much easier in XL since practically nobody uses it yet.
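For readers whose compiler does not accept the concept syntax shown above, here is a minimal sketch of a similar contract written with ordinary templates. It is only an approximation under stated assumptions: the trait is_less_than_comparable, the type opaque and the function checked_min are names made up for this illustration, not part of any C++ 0x draft, and the code assumes a compiler with C++11 support.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical trait (not from the C++ 0x draft): true when `a < b` compiles
// for two `const T&` operands and the result converts to bool.
template <typename T, typename = void>
struct is_less_than_comparable : std::false_type {};

template <typename T>
struct is_less_than_comparable<
    T,
    typename std::enable_if<std::is_convertible<
        decltype(std::declval<const T&>() < std::declval<const T&>()),
        bool>::value>::type> : std::true_type {};

// Hypothetical type with no operator<, used only to show the check failing.
struct opaque {};

// Same body as the min example above, with the contract checked up front.
template <typename T>
constexpr const T& checked_min(const T& x, const T& y) {
    static_assert(is_less_than_comparable<T>::value,
                  "checked_min requires a less-than operator");
    return x < y ? x : y;
}
```

Calling checked_min with two opaque values stops at the static assertion instead of deep inside the body. Note that this sketch only checks the caller's side of the contract: unlike the concept specification, it does nothing to verify that the body of the template uses only what the contract declares.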

Multitasking and Threads

C++ 0x also adds standard support for threads. In my opinion, it is ironic that they managed to shoehorn in support for a thread model that is so “last century”. Today, the difficult problem is not threading on an SMP system, but threading on non-uniform architectures, for example threading between a CPU and a GPU, or between the components of a Cell microprocessor, or threads that cooperate on machines with different architectures across the Internet.

This kind of problem is much more complicated, and is already, to some extent, solved by other languages such as Java or Erlang.

At this point, XL has little to offer in that space, because what is needed is not coded yet. However, I am confident that XL’s extensibility will make it easy to implement not one, but a multitude of tasking models. Among the top candidates are rendez-vous based mechanisms similar to Ada, message-passing protocols similar to Erlang, or data-driven parallelism similar to several functional languages. Stay tuned.

Variadic templates

C++ 0x will, at long last, implement variadic templates. This feature will make it possible to write functions that take a variable number of arguments, yet are type-safe.

This is, again, something that has existed in XL since 2001 or earlier. You can see that the XL implementation of the Max function takes advantage of this feature.

The C++ implementation is more complete, however, as it makes it possible to create not just variadic functions, but also variadic classes. This is something that is planned, but not currently implemented in XL.
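To make the C++ side concrete, here is a small sketch of a type-safe variadic maximum. The name max_of is invented for this example, and the code assumes a compiler that implements variadic templates and constexpr as they were eventually standardized in C++11; it is an illustration, not a transcription of the XL Max function.

```cpp
// Base case: the maximum of a single value is the value itself.
template <typename T>
constexpr T max_of(T x) { return x; }

// Recursive case: fold the first two values into one, then recurse on the rest.
// All arguments must have the same type T; mixing types is a compile-time error.
template <typename T, typename... Rest>
constexpr T max_of(T x, T y, Rest... rest) {
    return max_of(x < y ? y : x, rest...);
}
```

A call such as max_of(1, 5, 3) is checked entirely at compile time, while max_of(1, 2.5) is rejected because the two arguments do not agree on T.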

Range-based iterations

A new range-based iteration mechanism was also added to C++ 0x. XL has a more general form of iteration, that already covers this specific case. Here is for example how for loops are declared in XL:

iterator IntegerIterator(
    var It : integer;
    Low, High : integer
  ) written It in Low..High is
    It := Low
    while It <= High loop
        yield
        It := It + 1

The notation It in Low..High is how you will invoke the iterator, and the yield statement in the iterator is where the body of the loop will go. The usage of the iterator is very natural:

for I in 1..5 loop
    for J in 1..I loop
        WriteLn "I=", I, " and J=", J

The benefit of this more general approach is that you can for example define two-variable iterators:

iterator MatrixIterator (
    var I : integer; LI, HI : integer;
    var J : integer; LJ, HJ : integer
  ) written I,J in [LI..HI, LJ..HJ] is
    I := LI
    while I <= HI loop
        J := LJ
        while J <= HJ loop
            yield
            J := J + 1
        I := I + 1

for A, B in [3..5, 7..9] loop
    WriteLn "A=", A, " and B=", B

You can also define iterators over any kind of data structure, using any syntax you need for this particular data structure.
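For comparison, here is a rough sketch of what the C++ side needs in order to iterate over an inclusive integer range with the range-based for loop. The names int_range and sum_in are made up for this illustration, and the constexpr loop assumes a C++14 compiler. The point is that the loop syntax is fixed and the work goes into providing begin, end and an iterator type, whereas the XL iterator declares its own notation.

```cpp
// Hypothetical helper type: an inclusive integer range, like Low..High in XL.
struct int_range {
    int low;
    int high;

    // Minimal iterator: just enough for a range-based for loop.
    struct iterator {
        int value;
        constexpr int operator*() const { return value; }
        constexpr iterator& operator++() { ++value; return *this; }
        constexpr bool operator!=(const iterator& other) const {
            return value != other.value;
        }
    };

    constexpr iterator begin() const { return iterator{low}; }
    // One past the last value; yields an empty range when high < low.
    // (Sketch only: high + 1 overflows if high is the largest int.)
    constexpr iterator end() const {
        return iterator{high < low ? low : high + 1};
    }
};

// Sums the values visited by a range-based for loop over low..high.
constexpr int sum_in(int low, int high) {
    int total = 0;
    for (int i : int_range{low, high}) {
        total += i;
    }
    return total;
}
```

A two-variable iteration like the MatrixIterator above would need either nested loops or an iterator that returns a pair, since the C++ loop binds exactly one variable per range.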

Constant Expressions

C++ 0x introduces the notion of generalized constant expression. This makes it possible to declare functions that the compiler will be able to evaluate at compile time.

Once again, the XL approach is very different. The XL compiler has various phases, implemented as “plug-ins” for the compiler. One of them deals with constant folding (i.e. evaluation of constant expressions). Here is an example showing how to compute factorials at compile-time using that technique.

The XL pre-processor also makes it easy to implement compile-time assertions, something that is also a new feature of C++ 0x. The XL implementation, however, will automatically optimize a static assertion if it can evaluate the argument at compile time, instead of requiring a specific keyword.
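On the C++ side, the two features mentioned in this section combine as in the sketch below. It assumes a compiler implementing constexpr and static_assert as they were eventually standardized in C++11; the function name factorial is chosen for the example and is not taken from the XL sources.

```cpp
// A function the compiler may evaluate at compile time when given constants.
constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// Compile-time assertion: the build fails if the folded value is wrong.
static_assert(factorial(5) == 120, "factorial(5) should fold to 120");
```

Both the constexpr keyword and the static_assert keyword are required here, which is the contrast with the XL approach described above.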


C++ is an extremely complex language, and extending it took a lot of effort. Many of the new features have already existed in XL for a while, and are much easier to implement. However, the implementation in C++ points out some weaknesses in the way things are currently done in XL, something that is fortunately still easy to change that early in the language’s life.

Another C++ ugly feature

While telling friends about some of the nice syntactic ambiguities of C++, I noticed that the following code actually compiles on two compilers I trust, HP’s aCC (based on the highly respected EDG C++ front end), and g++ 3.4.5.

struct ripoux {
    ripoux & operator= (const ripoux &other) { return *this; }
    operator bool() { return true; }
};

ripoux operator *(const ripoux &a, const ripoux &b) { return b; }

int main()
{
    ripoux x, y, z;
    if (x * y = z)
        return 42;
    return 0;
}

What I find particularly distasteful with this code is that operator= is a standard assignment operator, as most C++ programmers would write it. Yet this operator can write into a so-called r-value, the result of a function call (in that case, a call to operator *), which is not returned by reference or anything that would suggest that it’s an l-value. To me, this means that the original meaning of l-value as on the left of an assignment is completely lost in C++…

I had two compilers accepting the code, and I could not find anything in the C++ standard actually forbidding it. So I checked with my personal C++ guru, Daveed Vandevoorde, and he confirmed that this was intended behavior. Another disappointment with C++, I guess.
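For what it is worth, compilers implementing C++11 offer a way for a class author to opt out of this behavior: a ref-qualifier on the assignment operator. The sketch below contrasts the two forms. The struct strict is a name invented for the comparison, and the type traits are used only to state the result without needing a program that fails to compile.

```cpp
#include <type_traits>

// As in the example above: assignment is accepted on r-values too.
struct ripoux {
    ripoux& operator=(const ripoux&) { return *this; }
};

// Hypothetical variant: the trailing & restricts assignment to l-values.
struct strict {
    strict& operator=(const strict&) & { return *this; }
};

// An r-value ripoux (such as the result of x * y) can be assigned to...
static_assert(std::is_assignable<ripoux, const ripoux&>::value,
              "r-value ripoux accepts assignment");
// ...while an r-value strict cannot, and an l-value strict still can.
static_assert(!std::is_assignable<strict, const strict&>::value,
              "r-value strict rejects assignment");
static_assert(std::is_assignable<strict&, const strict&>::value,
              "l-value strict accepts assignment");
```

The default remains the permissive one, so an assignment operator written the usual way still accepts x * y = z.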

C++0x: Hardly an improvement

I just read the article on C++0x in Wikipedia. C++0x is the “codename” for the next generation of the C++ programming language.

Well, call me a skeptic, but I’m hardly convinced by the general direction C++ has taken. It’s really too complex, there is almost one keyword per new feature. Even then, too often, the proposed syntax is simply puzzling. The C++ committee members are smart people, so there were probably good reasons related to the legacy of C++. But still…

Consider for instance the template typedef feature. The following syntax just seems bizarre to me:

template< class T > using Vector =
    MyVector< T, MyAlloc<T> > ;
Vector<int> int_vector ;

I can understand why a purist would want not to call it a “typedef”, since we are not dealing with a type but with a template. But using an equal sign to define a template, now that’s new… It’s not even consistent with any other template syntax.

In a sense, I’m really glad to no longer participate in the C++ committee. It’s not that I want to distance myself from committee members (I still have a few good friends there). But I really don’t want their job, it’s not fun to try to improve C++. It’s really tough, because the language is just too big and too inconsistent.

There are still some ideas that I like in there (with, in most cases, implementations I personally find ugly). I like to think that I had an influence on this: a few leading C++ committee members, including Bjarne Stroustrup, were very well aware of XL and concept programming. I know, because I wrote to them and argued about some XL features. This is only my opinion, Daveed Vandevoorde may not agree with me.

For example, I believe that variadic templates were not in the C++0x slides until after I sent an e-mail about the equivalent feature in XL in September 2001. Again, I may be wrong on this. It is also fairly possible that they were invented independently, if only because they had been in XL for a while (the linked test passed in August 2001, but the feature had apparently been implemented in August 2000). Actually, variadics is one case where C++0x influenced XL, because XL was using another keyword at the time, which I replaced with ... after a discussion on the C++0x syntax.

But the case for “concepts” is a bit more annoying. I remember writing another e-mail where I tried to explain concept programming. I may very well have taken the example of what is called generic validation in XL. And maybe someone genuinely thought this was what I had in mind when I talked about “concepts”. Sadly, it is not. Concepts (in my original interpretation) are all about bridging the gap between the problem space and the code space. Specific language constructs are tools to implement concepts, they are not the concepts.

Unfortunately, I’m afraid that the future evolution of C++ will make explaining my ideas about programming much more difficult than today. Not that it is easy today…