Experimenting with filesystems

Since macOS Sierra introduced the new Apple filesystem (APFS), designed by none other than Dominic Giampaolo (of BeOS fame), I thought I’d give it a try. So I created an APFS volume on an external 3G hard disk, copied a few pictures to it, and started playing with it.

APFS runs FSCK every time???

It mostly worked, although it comes with dire warnings, and you have to pass an insane option to the diskutil commands to get rid of them:

ddd@Marypuce Pictures> diskutil apfs list
WARNING:  You are using a pre-release version of the Apple File System called
          APFS which is meant for evaluation and development purposes only.
          Files stored on APFS volumes may not be accessible in future releases
          of macOS.  You should back up all of your data before using APFS and
          regularly back up data while using APFS, including before upgrading
          to future releases of macOS.

But things quickly went south as soon as I disconnected and reconnected the disk. It did not mount instantly, because an fsck process (File System Check) was running. Once this completed, I could see my disk, but it took minutes. So I tried ejecting the disk again. And sure enough, I had fsck running again next time I mounted the disk.

So I decided to try something else. I installed ZFS for OSX. I had only heard praise about ZFS being so great and this and that, so I thought it would be interesting.

ZFS can’t remount an external disk without some black magic?

Again, things went smoothly. Well, mostly. You have to activate some special option for the disk to “look like” HFS+ if you want Photos to be able to use it.

But again, things went south as soon as I disconnected the disk from one machine to put it in another one. I did something terribly wrong, you see: I ejected the disk on one Mac, and attached it to another. And I got this helpful little error message:

sudo zfs mount PhotosZFS
cannot open 'PhotosZFS': pool I/O is currently suspended

It looks like this is a standard issue with ZFS. You have to do some magic to export or import your ZFS pools. Something that I could understand. What I cannot understand is this response, from a guy with the nickname ilovezfs:

I see in IRC that the disk was actually disconnected and reconnected while the pool was imported. Given that this is a single partition pool, not a raidz or mirror vdev, there is no reason to expect the pool to continue to function after the device has been disconnected and reconnected without exporting it first. At that point, your only choice is to reboot.

So now you have this supposedly enterprise-grade, secure, checksummed, snapshotting, almost magical filesystem, but unmounting a disk and reconnecting it to another computer is so verboten there is no reason to expect the pool to continue to function? Well, yes, there is: every other filesystem on earth does that right. And suggesting the fix is to reboot? Give me a break.

I’ll try ZFS again in 10 years, when it knows how to deal with external disks and does not lose 350GB of data on its first day of operation.


macOS Sierra Mail bug

This week, I started using macOS Sierra. Overall, like iOS 10, it’s another one of these Apple releases of late where what you gain is not extraordinarily compelling, but you discover as you go various things that you lost for no good reason.

Here is one I found today. Apparently, macOS Sierra cannot send mail with picture attachments. That seems like a pretty big one. (Update: It apparently depends on the machine, see at end).

If I send an e-mail with a picture attachment from a machine running OSX 10.11, here is what I see in my Inbox in macOS Sierra:

Screen Shot 2016-10-06 at 12.38.09.png

So far, so good. Notice that the mail was sent to an Exchange server.

But now, let’s send an e-mail with a picture, this time from macOS Sierra. Here is what it looks like in my Inbox:

Screen Shot 2016-10-06 at 12.39.44.png

Now, something is obviously missing. In Outlook, I see some weird message telling me that the attachment was removed:

Screen Shot 2016-10-06 at 12.40.09.png

What is really curious is that it seems to depend on the server being used, not on the client. If I send the same kind of e-mail to a Google Mail account, then it looks like this, with a large empty box at the top, and then my picture attachment lost at the bottom (still not good, but at least the picture is not entirely lost):

Screen Shot 2016-10-06 at 12.42.46.png

I filed a bug report with Apple on this. It seems pretty major to me, and I really wonder how they could have missed it. Is there something special with my setup? Do you see the same thing?

Update: I tried sending an e-mail with an attachment from another Mac also running macOS Sierra, and I have no problem at all… except that the aspect ratio of the picture is all wrong in Outlook. So the problem is not with every instance of macOS Sierra, which is good news for Apple and bad news for the Apple Mail developers (bugs that don’t always happen are harder to figure out).

LLDB is a piece of crap (update: maybe it’s clang) (update 2: it’s actually ccache)

I’ve been really trying to use LLDB for a while now. Not that I really want to, but Apple went out of its way to make sure I had little choice. Not only is LLDB the default on MacOSX now, but GDB is really hard to make work on that platform as well. Can you imagine you have to generate a digital signature?

The first thing I don’t like about LLDB is its totally painful command structure. The LLDB authors published a GDB-to-LLDB conversion map, which they probably think is helpful. But to me, all it shows is that LLDB commands are more complex and more verbose than their GDB counterparts, with no obvious way to infer the LLDB command from either GDB experience, or from any kind of logic.

But the thing I dislike the most is that LLDB simply does not work, even when used with Apple tools, in a number of situations that I happen to hit practically on a daily basis. For example, it appears to be consistently unable to set breakpoints by file name and line number with command-line options that should be used frequently enough to just work.

Here is an example session that illustrates the problem.

ddd@Marypuce tmp> cat glop.cpp
#include <iostream>

int main()
{
    std::cerr << "Hello World\n";
}
ddd@Marypuce tmp> c++ -c -g glop.cpp -mmacosx-version-min=10.6 -o glop.o
ddd@Marypuce tmp> c++ -g glop.o -mmacosx-version-min=10.6 -o glop
ddd@Marypuce tmp> lldb glop
Current executable set to 'glop' (x86_64).
(lldb) b glop.cpp:5
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(lldb) ^D
ddd@Marypuce tmp> c++ -g glop.cpp -mmacosx-version-min=10.6 -o glop
ddd@Marypuce tmp> lldb glop
Current executable set to 'glop' (x86_64).
(lldb) b glop.cpp:5
Breakpoint 1: where = glop`main + 22 at glop.cpp:5, address = 0x0000000100000e56

[Update] I initially thought the -mmacosx-version-min=10.6 option was necessary for this problem to show up. But it’s also broken without it. I ran the c++ commands with -v to see what the difference was, and apparently, it’s many little things. So separate compilation with debug symbols just does not work. OK, maybe it’s more a compiler thing than a debugger thing. So maybe LLDB is not the piece of crap there. Still, that’s where the problem shows up.

This annoying bug shows how fragile this new support is. By making LLDB the default, and integrating it relatively well within Xcode, Apple is trying to slowly boil a frog here. But if you use the tools in a non-standard way, you might be burnt.

[Update 2] I incorrectly blamed clang and lldb for a problem that is actually with ccache. I am running ccache version 3.1.9 from MacPorts. If I get it out of the way, everything is back to normal. I sent a bug report email to the ccache, clang and lldb mailing lists hoping this will help someone else.

Connecting Mavericks to a Freebox: Oh the pain!

I’m very frustrated. Today, I wasted basically two or three hours fighting unreliable software implementing one of the most basic features in the world of networking, namely file sharing. It ought to be simple, it used to work, but if my experience is representative, it’s completely broken nowadays. Grenouille bouillie.

What I tried to do is not that complicated. We have a Freebox at home, it’s basically a DSL modem with many features, including the possibility to act as a NAS. So I connected my external drives, and tried to connect to them with my Macs.

It worked. I was happy. I started copying my files around. I noticed that it was very reactive. For example, I could eject a disk from the NAS user interface (a web GUI that is relatively well designed), and instantly, the file server would restart. I saw a message on the Macs saying that the server had shut down, and a couple of seconds later, I was back in business.

Then something happened, and everything stopped working at once.

It’s frustrating, because I know exactly what I did at that moment. I reformatted a disk with the NAS user interface. That disk was initially formatted as HFS+, which the NAS would expose as read only. So I reformatted it as Ext4.

And suddenly, the file server stopped working, even when I removed the disk in question. That leads me to believe I changed something else without knowing it. Maybe the GUI changed some configuration behind my back? I have no idea. All I know is that I spent the last two hours trying to understand how to revert my configuration so that I would be able to share disks again.

Symptoms: I connect to the NAS, and it refuses to show my disks. If I disable Mac sharing (AFP) and only enable PC sharing (SMB), then I can connect to the NAS, but somewhat unreliably.

I suspect the problem is on the Mavericks side of things, because connections to another NAS I have at home are equally flaky. One minute it transfers dozens of files per second, the next it’s as if I were writing to a floppy disk. Transfers over other protocols (e.g. using a web browser) are fast and reliable, so I don’t think it’s a Wifi or network issue.

How annoying.


Paul Graham recommends doing things that don’t scale

As usual, Paul Graham writes an interesting piece about startups. He recommends doing things that don’t scale. Thinking like a big company is a sure way to fail. It’s a reassuring piece for the startup creator that I am, because at Taodyne, we are indeed in this phase where you do everything yourself and you’d need 48 hours a day to do the basics. Good to know that the solution to this problem is to keep working.

Connect this to the survivor bias. This is a very serious cognitive bias, which makes us look only at the survivors, at the planes that return from combat, at the successful entrepreneurs. Because we don’t look at the dead startups or the planes that were shot down, we build our statistics on a biased sample. As a result, we make incorrect assumptions. For example, if the planes that return have mostly been shot in the tail and wings, you might deduce that this is where planes are being shot at, so those are the parts you need to protect, when in reality what this proves is that these are the parts that don’t prevent a plane from returning when shot. Very useful.
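To make the plane example concrete, here is a small simulation sketch. Everything in it is an assumption made up for illustration: the four sections, the number of hits per plane, and the lethality figures are hypothetical, not historical data. Hits are spread evenly over the plane, but only the planes that return get counted in the second tally.

```python
import random

def simulate(n_planes=10_000, hits_per_plane=3, seed=42):
    """Count hits per section on all planes vs. on planes that returned."""
    rng = random.Random(seed)
    sections = ["engine", "cockpit", "wings", "tail"]
    # Hypothetical chance that a single hit in a section downs the plane.
    lethality = {"engine": 0.6, "cockpit": 0.5, "wings": 0.05, "tail": 0.05}

    all_hits = dict.fromkeys(sections, 0)
    survivor_hits = dict.fromkeys(sections, 0)
    for _ in range(n_planes):
        # Every section is equally likely to be hit.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        for s in hits:
            all_hits[s] += 1
        # The plane returns only if none of its hits proves fatal.
        survived = all(rng.random() >= lethality[s] for s in hits)
        if survived:
            for s in hits:
                survivor_hits[s] += 1
    return all_hits, survivor_hits

if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    for s in all_hits:
        print(f"{s:8s} all planes: {all_hits[s]:6d}   returned: {survivor_hits[s]:6d}")
```

If you only look at the "returned" column, wings and tail seem to attract most of the fire, even though the "all planes" column shows that hits were spread evenly. That is the biased sample at work.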

Last interesting link of the day is the discussion about bullying on the Linux Kernel Mailing List (LKML). Sarah Sharp, a female Intel engineer, stands up to Linus Torvalds and asks him to stop the verbal abuse. It’s an interesting conflict between very smart people. To me, there’s a lot of cultural difference at play here (one of the main topics of Grenouille Bouillie). For example, I learned from Torvalds what Management by Perkele means. On the one hand, it’s legitimate for Sarah to explain that she is offended by Linus’ behavior. On the other hand, it’s legitimate for Linus to keep doing what works.

Sarah reminds me of a very good friend of mine and former colleague, Karen Noel, a very sharp engineer who joined me on the HPVM project and taught me everything I forgot about VMS. Like Sarah, Karen was willing to stand her ground while remaining very polite.

Apple backups and RAID are not reliable

You’d think that if you use RAID1 and multiple redundant, distributed backups with hourly backups, daily backups, etc., you’d be safe? Think again. If your backup software lies to you, you may not realize it until it’s way too late. If your RAID software does not deem it worthy to mention that a disk failed, what good is it?

Everything is broken and no one cares

This post from Dear Apple is just so true, and so clearly on topic for Grenouille Bouillie!

Have we reached the point in complexity where we can’t make good quality products anymore? Or is that some kind of strategic choice?

The original post is mostly about Apple products, but the same is true with Linux, with Android, with Windows.

Here is my own list of additional bugs, focusing on those that can easily be reproduced:

  1. Open a file named X in any of the new Apple applications, those without Save As. Open another file named Y. Save Y as X. Beachball. For every application. Worse yet, since applications often remember which windows were open, you get the beachball again when you reopen the application. It takes another force quit for the application to (fortunately) offer to not reopen the windows.
  2. A relatively well known one now: Type F i l e : / / / in practically any OSX application. Without the spaces. Hang or assert depending on your luck.
  3. Use a stereoscopic application like Tao Presentations (http://www.taodyne.com). Activate stereoscopy. Switch spaces or unplug an external monitor. Kernel panic or hang to be expected. Go tell your customers that the kernel panic is Apple’s fault, not ours…
  4. If you back up over the network, set your computer to sleep after say 1 hour while on power. Change your disk enough that the backup takes more than one hour. The backup disk will come up as corrupt after a couple of days, and OSX will suggest you start a new one (and the cycle will repeat).
  5. Use the “Share” button. It takes forever to show the window (like 2-3 seconds in general on my 2.6GHz quad-core i7 with 8GB of RAM). Since what I type generally begins with an uppercase letter, I usually prepare myself by having my finger on the shift key. But to that stupid animation framework, “shift” means “slow animation down so that Steve can demo it”. Steve is dead, but the “shift” behavior is still there.

I’ll keep updating this list as more come to mind. Add your own favorite bugs in the comments.

First update (Feb 13, 2013):

  1. Safari often fails to refresh various portions of the screen. Visible in particular when used in combination with Redmine. This used to be very annoying, but it has gotten much better in more recent updates of Safari.
  2. iTunes 11 no longer has Coverflow. It was a neat way to navigate in your music, which wasn’t even the default, why remove it?
  3. Valgrind on OSX 10.8 is completely broken. I have no idea what’s wrong, but it’s a pretty useful tool for developers, and Apple has nothing in its own development tools that is even remotely close.
  4. “Detect displays” is gone, both from the Monitors control panel and from the Monitors menu icon. Combine that with the fact that OSX 10.8, unlike its predecessors, sometimes totally fails to detect that you unplug a monitor. And you find yourself with windows stuck on a screen that is no longer there…
  5. That little Monitor menu icon used to be quite handy, e.g. to select the right resolution when connecting to an external projector for the first time. Now, it’s entirely useless. It only offers mirroring, fails to show up 90% of the time when there is a possibility to do mirroring, shows up when mirroring is impossible (e.g. after you disconnected the projector). It used to be working and useful, it’s now broken and useless. What’s not to love?
  6. Contacts used to have a way for me to format phone numbers the way I like. That’s gone. Now I have to accept the (broken) way it formats all phone numbers for me.
  7. I used to be able to sync between iPhone and Contacts relatively reliably. Now, if there’s a way to remove a phone number, I’ve not found it. Old numbers I removed keep reappearing at the next sync, ensuring that I never know which of the 2, 3 or 4 phone numbers I have is the one that is not dead.
  8. Still in Contacts, putting Facebook e-mail addresses as the first choice for my contacts? No thanks, it was heinous enough that Facebook replaced all genuine email addresses with @facebook.com aliases. But having that as the first one that pops up is really annoying.
  9. Now fixed, but in the early 10.8, connecting a wired network when I also had Wifi on the same network would not give me higher speed. It would just drop all network connectivity.

Updated February 28th after restoring a machine following a serious problem:

  1. Time Machine restores are only good if your target disk is at least as big. But with Apple’s recent move to SSD, this may no longer be affordable for you. In my case, I’d like to squeeze 1TB of data into 512GB. Time Machine does not give me the level of fine-grained control I’d need to restore what I really need. So I need to try and do it manually, which is a real pain.
  2. Calendar sync is a real mess. Restoring calendars from a backup is worse.
  3. Spaces? Where are my good old spaces? Why is it that I had spaces on the original machine, no longer have them, and find myself unable to say “I want 6 spaces” or to set up keyboard shortcuts for them as they used to be?