Is it normal to wait for your computer? Why should I wait 5 seconds when I click on a menu? Why does it sometimes take half a minute to open a new document? Developers, optimize your code, if only as a matter of public service! What about making it a New Year resolution?

Why is my Mac laptop slower than my iPad?

Apple cares about iPad performance

I have a serious issue with the fact that on a laptop with 8GB of RAM, a 1TB hard disk, and a quad-core 2GHz i7, I spend my time waiting. All the time. For long, horribly annoying pauses.

Just typing these few paragraphs had Safari go into “pause” twice. I type something and it takes ten seconds or so with nothing showing up on screen, and then it catches up. Whaaaaat? How did programmers manage to write code so horribly that a computer with a quad-core 2.6GHz i7 can’t even keep up with my typing? Seriously? The Apple II, with its glorious 1MHz 8-bit 6502 never had trouble keeping up, no matter how fast I typed. Nor did Snow Leopard, for that matter…

Even today, why is it that I always find myself waiting for my Mac as soon as I have 5 to 10 applications open, when a poor iPad always feels responsive even with 20 or 30 applications open at the same time? Aren’t we talking about the same company (Apple)? About the same core operating system (Darwin being the core of both iOS and OSX)? So what’s the difference?

The difference, folks, is optimizations. Code for iOS is tuned, tight, fit. Applications are programmed with severe hardware limitations in mind. The iPad, for instance, is very good at “pausing” applications that you are not using and recalling them quickly when you switch to them. Also, most applications are very careful in their use of resources, in particular memory and storage. Apple definitely cares about the performance of the iPad. There was a time the performance of the Mac mattered as well, but that was a long time ago.

Boiled frog syndrome: we slowly got used to desktops or laptops being slower than tablets, but it’s just plain stupid.

Lion and Mountain Lion are Dog Slow

It’s obvious why they called it Lion…

I’ve been running every single version of MacOSX since the Rhapsody days. Up until Snow Leopard, each release was a definite improvement over the previous version. Lion and Mountain Lion, on the other hand, were a severe step backwards…

Lion and Mountain Lion were not just loaded with features I didn’t care about (like crippling my address book with Facebook email addresses), they didn’t just break features I relied on on a daily basis (like full screen applications that work with multiple monitors, or RSS feeds). They were slow.

We are not talking about small-scale slowness here. We are talking about molasses-fed slugs caught in a tar pit, of lag raised to an art form, of junk code piling up at an industrial scale, of inefficiency that makes Soviet car design look good in comparison.

And it’s not just me. My wife and my kids keep complaining that “the machine lags”. And it’s been the case with every single machine I “upgraded” to Lion or Mountain Lion. To the point where I’m not upgrading my other machines anymore.

In my experience, the core issue is memory management. OSX Lion and Mountain Lion are much worse than their predecessors at handling multiple programs. On OSX, the primary rule of optimization seems to be “grab 1GB of memory first, ask questions later.” That makes sense if you are alone: RAM is faster than disk, by orders of magnitude, so copying stuff there is a good idea if you use it frequently.

But if you share the RAM with other apps, you may push those other apps out of memory, a process called “paging”. Paging is driven largely by heuristics, and it has a major impact on performance. Because, you see, RAM is faster than disk, by orders of magnitude. And now, this plays against you.
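That orders-of-magnitude gap is easy to see for yourself. Here is a minimal Python sketch (the 50MB payload is an arbitrary size of my choosing, and the exact numbers will vary by machine):

```python
import os
import tempfile
import time

N = 50 * 1024 * 1024           # 50 MB payload, an arbitrary test size
payload = b"x" * N

# RAM: copy the buffer within memory
t0 = time.perf_counter()
ram_copy = bytearray(payload)  # bytearray forces a real copy
ram_time = time.perf_counter() - t0

# Disk: write the same buffer to a file and force it to stable storage
t0 = time.perf_counter()
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    f.flush()
    os.fsync(f.fileno())       # don't let the OS write cache hide the disk
disk_time = time.perf_counter() - t0
os.unlink(f.name)

print(f"RAM copy:   {ram_time * 1000:8.1f} ms")
print(f"Disk write: {disk_time * 1000:8.1f} ms")
```

On a spinning disk of that era the ratio is easily 100 to 1, and every page the OS evicts on your behalf is paid back at that exchange rate the next time you touch it.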

Here is an example of a heuristic that I believe was introduced in Lion: the OS apparently puts aside programs that you have not been using for a long while. A bit like an iPad, I guess. On the surface, this seems like a good idea. If you are not using them, free some memory for other programs. But this means that if I go away from my laptop and the screen saver kicks in, it will eat all available RAM and push other programs out. When I log back in… I have 3GB of free RAM and a spinning beach ball. Every time. And even if the screensaver does not run, other things like backupd (the backup daemon) or Spotlight surely will use a few gigabytes for, you know, copying files, indexing them, stuff.

Boiled frog syndrome: we slowly got used to programs using thousands of Mac 128Ks’ worth of memory to do simple things like running a screensaver. It’s preposterous.

Tuning memory management is very hard

Virtual Memory is complicated

The VM subsystem, responsible for memory management, was never particularly good in OSX. I remember a meeting with an Apple executive back when OSX was still called Rhapsody. Apple engineers were all excited about the new memory management, which was admittedly an improvement over MacOS9.

I told the Apple person I met that I could crash his Mac within 2 minutes at the keyboard, doing only things a normal user could do (i.e. no Terminal…). He laughed at me, gave me his keyboard, and refused to even save his documents first. Foolish, that.

I went to the ancestor of Preview.app, opened a document, and clicked on “Zoom” repeatedly until the zoom factor was about 6400% or so. See, in those days, the application apparently allocated a rendering buffer that grew as you zoomed. The machine crawled to a halt as it started paging those gigabytes in and out just to draw the preview on the screen. “It’s dead, Jim”, time to reboot with a long, hard and somewhat angry press on the Power button.
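The arithmetic checks out. Assuming, and these are my guesses for illustration rather than the app’s actual numbers, a US Letter page rendered at 72 dpi with 4 bytes per pixel, a 6400% zoom means:

```python
# Hypothetical rendering-buffer size at 6400% zoom.
# Page dimensions and pixel format are assumptions for illustration.
page_w = int(8.5 * 72)    # 612 px wide at 100%
page_h = int(11 * 72)     # 792 px tall at 100%
zoom = 64                 # 6400%
bytes_per_pixel = 4       # RGBA

buffer_bytes = (page_w * zoom) * (page_h * zoom) * bytes_per_pixel
print(f"{buffer_bytes / 2**30:.1f} GiB")  # prints "7.4 GiB"
```

Nearly 8 GiB for a single window, on machines that shipped with a small fraction of that as physical RAM; no wonder it paged itself to death.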

That particular problem was fixed, but not the underlying issue, which is a philosophical decision to take control away from users in the name of “simplicity”. OS9 allowed me to say that an App was supposed to use 8M of RAM. OSX does not. I wish I could say: “Screen Saver can use 256M of RAM. If it wants more, have it page to disk, not the other apps.” If there is a way to do that, I have not found it.

Boiled frog syndrome: we have slowly been accustomed by software vendors to giving away control. But lack of control is not a feature.

Faster machines are not faster

A 1986 Mac beats a 2007 PC

One issue with poor optimizations is that faster machines, with much faster CPUs, GPUs and hard disks, are not actually faster to perform the tasks the user expects from them, because they are burdened with much higher loads. It’s as if developers always stopped at the limit of what the machine can do.

It actually makes business sense, because you get the most out of your machine. But it also means it’s easy to push the machine right over the edge. And more to the point, an original 1986 Mac Plus will execute programs designed for it faster than a 2007 machine. I bet this would still hold in 2013.

So if you have been brainwashed by “Premature optimization is the root of all evil“, you should forget that. Optimizing is good. Optimize where it matters. Always. Or, as a colleague of mine once put it, “belated pessimization is the leaf of no good.”

Boiled frog syndrome: we have slowly been accustomed to our machines running inefficient code. But inefficiency is not a law of nature. Actually, in the natural world, inefficiency gets you killed. So…


VMworld 2007

VMworld 2007 was held at the Moscone Center in San Francisco on September 11-13. You can access all the keynote sessions here. Attendance was in the 10,000 range this year, as opposed to 1,500 a couple of years ago. Virtualization is becoming “mainstream”.

Diane Greene’s keynote

Diane Greene, VMware’s co-founder and CEO, gave the initial keynote. On a personal note, I had not seen her for about 5 or 6 years, and it was a kind of painful reminder of how much time has flown by since the beginnings of HP Integrity VM.

The big deal, in my opinion, was the announcement of VMware ESX 3i. This is basically “VMware in your pocket”, at least as far as demos are concerned. That is not something you care about much in the data center; what you care about there is ease of deployment, and the demo pretty conclusively showed the benefits. Essentially, you unpack the server, and in a few minutes, it’s ready to run virtual machines.

VMware is certainly not the first to use a standalone hypervisor. The IBM Power 5 hypervisor has essentially been architected like that for a while. Just as with IBM, ESX 3i is intended to become part of the firmware of the machines. Once you say that, it becomes obvious why VMware is interested in doing this: it’s all about control. With all the talk about Microsoft Viridian, i.e. Microsoft building a hypervisor directly into the operating system, VMware was at risk of losing control. So building the hypervisor directly into the hardware is a very smart move.

On the other hand, the explanation of the difference between an OS and a hypervisor was a bit belabored. When VMware states that “the RedHat console OS takes 2G of footprint”, it is, I guess, not memory (not on x86, at least). So that’s probably disk. But making the hypervisor a separate component is a bit like making the Linux kernel a separate component. As the GNU folks are fond of pointing out, a kernel by itself does not do much. And a Linux kernel on x86 is also on the order of a few megabytes, not a few gigabytes.

VMware co-founder predicts death of operating systems

According to this article, VMware co-founder Mendel Rosenblum predicts the death of the operating system. I once made a somewhat similar prediction, although at the time, I thought that it would disappear from view, not from existence.

Both positions are closer than it may appear at first. Rosenblum and I both predict the disappearance of the OS as somewhat irrelevant to the end user. Rosenblum thinks that it will become irrelevant because it no longer controls the hardware directly, its role being reduced to providing application programming interfaces to the applications. I actually agree with this point of view. Otherwise, I would not have initiated another virtual machine technology, HP Integrity VM, and still be working on it today. As a side note, when we started that project, we negotiated with VMware in general and Rosenblum in particular.

But I also stand by my other prediction, that operating systems will fade into irrelevance on the user side as well. This is not yet true for personal computer operating systems, but it is already true of the majority of operating systems we use today:

  • You probably don’t know what operating system the many electronics components in your car are running.
  • Similarly, the OSes in your set-top box, in airplane entertainment systems, in your MP3 player and GPS device are probably not very relevant to how you use these devices.
  • Finally, on a larger scale, the operating systems behind services like Google, the delivery of your e-mail or IP phones are also, I would bet, not part of your decision when choosing one service or another.

Virtualization benchmark

I just came across a very simple virtualization benchmark. This kind of benchmark shows some of the problems with virtualization: it usually works pretty well at limited load, but you pay the price at high load. This appears to be true of any virtualization technology today, even if it varies in the details.

Benchmarking often boils down to a petty fight. One gets a true sense of just how reliable benchmarks can be by comparing the Xen and VMware results for what appear to be the same benchmarks (but, oddly enough, with opposite conclusions).

Unfortunately, hardware assistance for virtualization, like the Intel VT-x and VT-i technologies, does not seem to help much there (yet). It seems like it was designed primarily to make virtualization easier. As a result, we get for example Xen running Windows guests. But it does not really help with performance (summary).

VMware soon to virtualize 3D graphics

This started with a video on Youtube purportedly showing accelerated 3D graphics in VMware Fusion. The news was quickly spotted by Macbidouille and many others, and later confirmed.

This is interesting news, for two reasons. The first one is that graphics performance has been a main bottleneck for desktop virtualization for a long time, and more importantly, a functional bottleneck. In particular, it excluded any kind of “modern” gaming in a virtual machine. Having solved that is really neat, both technically and as a way to make virtualization more mainstream.

The second reason is that, if indeed we are talking about a fully supported feature, I believe that it appeared on the Mac version first. Granted, it has been present in a limited form in Workstation 5.0, for a little more than one year, but the feature is not even advertised in VMware Workstation 5 datasheets. With the Intel Macs, VMware found itself another mass market for its workstation products, one with presumably a much better attach rate, and more importantly, a market where they were beaten to the gates by Parallels.

Once more, innovation is fueled by competition. Good for Macintosh users…

Longhorn virtualization

Microsoft recently published a video of their upcoming Longhorn virtualization. Interestingly, they demonstrate a feature that they claim no competitor can match, namely an 8-core virtual machine. I’m really curious to see the final product. Being in the field, I know exactly what kind of problem virtual machines run into with scalability.

A key problem is relatively simple to explain, and having explained it to many HP customers, I see no reason not to talk about it on this blog. When you schedule more than one virtual CPU, there are essentially two ways of doing it.

  • You can use something called gang scheduling. In that case, all virtual CPUs run at the same time. One major benefit is that if a virtual CPU is waiting for a lock, the virtual CPU that holds the lock is also running. So there is no major increase in lock contention, a key factor in scalability. The major drawback is that an 8-way virtual machine with only 1 busy CPU still consumes 8 physical CPUs. That is not a very good use of the CPUs.
  • An alternative is to schedule each virtual CPU individually. This is much more difficult to get right, in particular because of the lock contention issue outlined above. On the other hand, the benefit in terms of CPU utilization for virtual machines that use only some of their virtual CPUs is enormous.
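The occupancy difference between the two approaches can be captured in a toy model (the function and its numbers are my own illustration, not any vendor’s scheduler):

```python
def physical_cpus_occupied(vcpus, busy_vcpus, gang):
    """Physical CPUs a guest holds while it is scheduled.

    Gang scheduling runs every virtual CPU at once, so idle vCPUs
    still pin down hardware; independent scheduling only needs
    physical CPUs for the vCPUs that actually have work to do.
    """
    return vcpus if gang else busy_vcpus

# The 8-way guest with a single busy virtual CPU from the text:
print(physical_cpus_occupied(8, 1, gang=True))   # 8
print(physical_cpus_occupied(8, 1, gang=False))  # 1
```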

So here is a test I’m interested in running on Longhorn virtualization when it becomes available: if I have two 4-way virtual machines on 4-way hardware, and if I start one “spinner” process in each that counts as fast as possible, do I see:

  • 4 processors pegged at 100%, with the two spinners running at half their nominal rate, since they each get essentially 50% of one CPU? As far as I know, this is the behavior for VMware, and it’s characteristic of gang scheduling.
  • 2 processors pegged at 100% and 2 processors sitting mostly idle, with each spinner getting 100% of a CPU? This is the behavior of Integrity Virtual Machines.
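The two expected outcomes follow from a back-of-the-envelope model. This is a sketch under my own simplifying assumptions (one spinner per guest, a perfectly fair scheduler, no overhead):

```python
def spinner_rate(host_cpus, guests, vcpus_per_guest, gang):
    """Fraction of nominal speed each spinner achieves."""
    if gang:
        # A gang-scheduled guest needs all its vCPUs on hardware at
        # once, so only host_cpus // vcpus_per_guest guests can run
        # at any instant; the guests time-slice the host.
        slots = host_cpus // vcpus_per_guest
        return min(1.0, slots / guests)
    # Independent scheduling: only the busy vCPUs (one spinner per
    # guest) compete for physical CPUs.
    busy = guests
    return min(1.0, host_cpus / busy)

# Two 4-way guests on 4-way hardware, as in the test above:
print(spinner_rate(4, 2, 4, gang=True))   # 0.5 (half nominal rate)
print(spinner_rate(4, 2, 4, gang=False))  # 1.0 (full rate)
```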

The bottom line is: there is a serious trade-off between scalability and efficient use of resources. If Microsoft has managed to solve that problem, kudos to them.