The ‘ZFS is too complex’ Argument

Jeff Harrell has a nice blog, The Shape of Days: Pretty darned close to perfection, and he has a nice love entry about Time Machine. One of the comments explains that Time Machine is bull and that Apple should have used ZFS instead, something I completely agree with. Jeff then replies to that, saying that ZFS is too complicated.

I seriously don’t buy that ‘ZFS is too complex’ argument. In the common case, for example a MacBook with one disk and one partition, ZFS is no more complex than the way disks are currently set up with HFS.

ZFS provides a LOT of cool stuff but you certainly don’t have to use it all. Sure, when you want to implement schemes like RAID with ZFS then things become more complicated, but the same is true in the current HFS situation. And a GUI can certainly hide all the ‘scary’ ZFS details there, just as is possible now: fire up Disk Utility if you are that Pro user.

Anyway, back to the common case where there is one ZFS partition on an iMac or MacBook. Even there the advantages are huge: you get snapshots for FREE. Snapshots will provide exactly the same functionality as Time Machine, except that they are:

  • CORE FUNCTIONALITY – Snapshots are not a hack like TM currently is.
  • INSTANT – There is no need to wait until a snapshot is actually made.
  • DON’T REQUIRE AN INITIAL BACKUP – ZFS Snapshots don’t make you wait hours for the first backup. There IS no initial backup.
  • DON’T REQUIRE AN EXTERNAL DISK – ZFS Snapshots work on a single partition. You don’t need an extra disk. Time Machine is useless for people working on MacBooks. ZFS could turn that around.
  • TAKE NO SPACE – Snapshots take virtually no space until you actually start changing files that are contained in a snapshot.
  • CAN BE BROWSED – Just as in Time Machine you can browse your snapshots back in time, see what files changed, open old files, etc. No magic there.
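
The ‘instant’ and ‘take no space’ points in the list above both follow from copy-on-write. Here is a minimal Python sketch of the idea, assuming a toy in-memory block store. The class ToyCowVolume and its methods are made up for illustration; this is not the ZFS API or its on-disk format.

```python
# Toy model of copy-on-write snapshots. Hypothetical illustration only:
# it shows why a snapshot can be instant and initially free, nothing more.

class ToyCowVolume:
    def __init__(self):
        self.blocks = {}      # block id -> bytes (the only place data lives)
        self.live = {}        # file name -> list of block ids
        self.snapshots = {}   # snapshot name -> frozen copy of the file table
        self._next_id = 0

    def _store(self, data):
        # New data always lands in a fresh block; old blocks are never overwritten.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id

    def write_file(self, name, chunks):
        self.live[name] = [self._store(chunk) for chunk in chunks]

    def update_block(self, name, index, data):
        # Rewrite one block of a file by pointing the live table at a new block.
        ids = list(self.live[name])
        ids[index] = self._store(data)
        self.live[name] = ids

    def snapshot(self, snap_name):
        # Copies only the pointers in the file table, never the data blocks.
        self.snapshots[snap_name] = {f: tuple(ids) for f, ids in self.live.items()}

    def read(self, name, snap_name=None):
        table = self.live if snap_name is None else self.snapshots[snap_name]
        return b"".join(self.blocks[i] for i in table[name])

    def blocks_in_use(self):
        # Count blocks referenced by the live table or by any snapshot.
        used = set()
        for table in [self.live, *self.snapshots.values()]:
            for ids in table.values():
                used.update(ids)
        return len(used)


vol = ToyCowVolume()
vol.write_file("talk.key", [b"slide1", b"slide2", b"slide3"])
before = vol.blocks_in_use()

vol.snapshot("hourly-1")                 # no data is copied here
after_snapshot = vol.blocks_in_use()

vol.update_block("talk.key", 1, b"SLIDE2")
after_edit = vol.blocks_in_use()         # only the rewritten block costs space

old_version = vol.read("talk.key", "hourly-1")
new_version = vol.read("talk.key")
```

In this model the snapshot costs nothing until a block is rewritten, and the old version stays readable afterwards. Everything still lives on the same disk though, so the model says nothing about surviving a disk failure.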

Compare that to the current implementation of Time Machine. You can’t deny the advantages of ZFS.

I actually think Apple had planned to use ZFS for OS X 10.5 in general and to base Time Machine on it too. I also think that they completely underestimated the scope of the ZFS porting effort, and that they simply missed the Leopard release window.

Let’s hope they turn it around in a later release.

11 comments so far

  1. Jeff Harrell on

    Thanks for name-checking my blog. Unfortunately, it seems like you got just about everything you wrote here wrong.

    Time Machine is not a hack at all. It’s a daemon that depends on two features of Leopard: FSEvents and directory hard links.

    Saying ZFS snapshots are “instant” is disingenuous in the extreme. The only way ZFS snapshots are instant is if you’re only using one storage pool. That means either you’ve got just one block device, or you’ve got two or more block devices inside your computer, or you’re using an internal block device plus one or more external block devices. If you’ve got two or more internal devices, you’re using a Mac Pro, which makes you pretty rare among Mac owners. If you’re using just your internal drive, then you’re not backed up AT ALL. And if you’re using an internal drive plus an external drive, then your life is exactly like it would be with Time Machine: You’re backing up only when you’re plugged in. (In that scenario, you have to mirror your internal drive with your external drive, then run degraded when you’re not plugged in, then resilver when you plug in. Same basic idea as Time Machine, only with more complexity.)

    ZFS snapshots, of course, DO require an initial backup. How’s the data supposed to get onto your backup disk in the first place? Mirror and resilver. I haven’t time-tested the two operations, of course, but doing an initial resilver of a mirror set and doing an initial Time Machine backup are comparable tasks, so there wouldn’t be a ton of time saved there.

    If you’re not using an external disk, you’re not backing up. Period. Time Machine is obviously not “useless” for people who have MacBooks, since I’m typing this on a MacBook and I’ve been using Time Machine for more than a week now to great effect.

    Snapshots and Time Machine backups are almost exactly the same on the “take no space” front. That is, in both cases it’s not really true. Time Machine uses hard links to avoid copying unchanged data; ZFS uses block-level copy-on-write. The advantage of block-level copy-on-write is outweighed by the twin disadvantages of large block sizes and unavoidable block-level filesystem fragmentation.

    So, Mr. Anonymous Blogger Guy, not only CAN I deny the alleged advantages of ZFS, I DO deny them. There aren’t any. Not for single-user, small-form-factor computer systems like laptops and iMacs. Maybe someday, when the bugs are worked out. But not today.

  2. thelameleopard on

    “ZFS snapshots, of course, DO require an initial backup. How’s the data supposed to get onto your backup disk in the first place?”

    I seriously don’t care about a backup disk. I want Time Machine functionality without the extra disk. The need for an extra disk is a design flaw of Time Machine.

    Actually I want both. I want Time Machine functionality on my single disk in my MacBook, without having that extra disk with me all the time. AND I want the ability to sync my disk to an extra disk.

    ZFS provides for both. Snapshots take care of the Time Machine functionality and if I *choose* to have a second disk as a hot copy then I can configure ZFS to do that too. Read how James Gosling does that with Solaris/ZFS on his laptop with a backup disk at home. It is awesome and I want that too.

  3. Tom Bridge on

    For anything to be really a backup, it absolutely, positively has to rest on media that’s NOT the original drive. Media still fails in this day and age (I know, it happened to new clients recently) and thus to be a backup, it’s gotta be on a second disk.

    And you CAN have Time Machine functionality without hauling a disk around; it’s how I work now. I plug in overnight, or when I am working from the house, and then when I’m out on the town, I fly without a net. When I’m home, I plug in, and wham, Time Machine is back in effect. I wouldn’t get hourlies when I’m not hooked up, but that’s all James Gosling describes. Oh, and he can’t sleep his laptop while it’s running, so I guess there are tradeoffs.

  4. todd on

    “Time Machine functionality without the extra disk” is not Time Machine functionality. Time Machine is a backup tool and there’s no backup without the external copy. Creating browse-able versions of a file (which is what you want and what Time Machine and ZFS give) on one disk does not protect you from disk failure and is NOT a backup.

  5. Dmitri Trembovetski on

    “Creating browse-able versions of a file (which is what you want and what Time Machine and ZFS give) on one disk does not protect you from disk failure and is NOT a backup.”

    Not being able to back up at all when you are not connected to an external drive is NOT a backup either.

    You worked on a presentation while on a flight, and accidentally lost it. How will TM help you here? You could have made a ZFS snapshot (rather, it should have been done for you automatically by the system – snapshots are free) and restored it in a blink.

    Dmitri

  6. Paul Austin on

    In my view there are three types of backup.

    1. Saving a version of a file so that you can roll back any edits you make to the file. This is what you get with a ZFS snapshot. So if you accidentally delete a file or make some changes you don’t like you can get the previous version back. Version control systems such as Subversion (SVN) also allow this kind of functionality. This type of backup can be done on the same disk.

    2. Making a redundant copy of files on a remote disk to protect against disasters such as fire etc.

    3. Mirroring disks or using RAID 5 to tolerate the failure of a single hard drive, so the computer keeps functioning without losing data.

    In my view all of these are applicable to most systems.

  7. Idetrorce on

    very interesting, but I don’t agree with you
    Idetrorce

  8. thelameleopard on

    Idetrorce, care to explain in more detail?

  9. Martin on

    One point which people seem to have missed here is that, regardless of the technical merits or demerits of each approach, Apple must also provide a smooth, seamless update path for the installed base.

    I can’t imagine writing a reliable tool to in-place convert a near-full HFS+ partition to a ZFS one is going to be much fun even if it is technically possible. Realistically, to upgrade to ZFS in the future will likely require a filesystem wipe and reformat. That gets a lot more seamless if the vast majority of users already have all of their files reliably backed up to external media. Otherwise you’re going to have to tell them to go copy all of their files to external media as part of the upgrade process, which turns a simple, short upgrade process into a scary one for the average user.

    This really mandates that Apple first roll out something like the current release of Time Machine, which can be incrementally and non-destructively applied without disrupting anyone’s work or play. Get the majority of the user base happy with Time Machine in this form, and you have a lot more options for changing your core OS filesystem in a future upgrade…

  10. Slashdot on

    All you noobs saying ZFS sucks need to go and do your homework. The average user doesn’t care what the file system is and the fact is ZFS provides better performance than HFS which is why the average user should have the option of using ZFS on a Mac OS X install.

  11. Blake Irvin on

    Quoting Jeff Harrell:

    “Snapshots and Time Machine backups are almost exactly the same on the “take no space” front. That is, in both cases it’s not really true. Time Machine uses hard links to avoid copying unchanged data; ZFS uses block-level copy-on-write. The advantage of block-level copy-on-write is outweighed by the twin disadvantages of large block sizes and unavoidable block-level filesystem fragmentation.”

    This is not totally accurate. ZFS snapshots take into account only differing blocks. If a photo’s title changes, Time Machine must make a new copy of the whole file; ZFS writes only the blocks needed for the change, and is therefore much more efficient than Time Machine (especially when the changes are small and the files are big):

    http://en.wikipedia.org/wiki/ZFS#Snapshots_and_clones

    Regarding fragmentation, I don’t think ZFS should fragment much more than HFS+ already does:

    http://en.wikipedia.org/wiki/Delayed_allocation
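
The space argument in the comments above (hard links versus block-level copy-on-write) can be put into rough numbers. Here is a small Python sketch under stated assumptions: the 20 MiB photo, the single changed block, and the 128 KiB block size are illustrative values, and both helper functions are hypothetical. It ignores metadata overhead, compression, and fragmentation.

```python
# Back-of-the-envelope comparison of the two approaches discussed above.
# All sizes below are assumed example values, not measurements.

def file_level_backup_cost(file_sizes, changed_files):
    """Bytes a hard-link style backup copies: every changed file, in full."""
    return sum(file_sizes[name] for name in changed_files)


def block_level_snapshot_cost(changed_blocks, block_size):
    """Bytes a copy-on-write snapshot keeps pinned: only the rewritten blocks."""
    return sum(changed_blocks.values()) * block_size


MIB = 1024 * 1024
KIB = 1024

file_sizes = {"photo.jpg": 20 * MIB}    # assumed: one 20 MiB photo
changed_blocks = {"photo.jpg": 1}       # assumed: a small edit touches one block
block_size = 128 * KIB                  # assumed block size for this example

file_cost = file_level_backup_cost(file_sizes, ["photo.jpg"])
block_cost = block_level_snapshot_cost(changed_blocks, block_size)

print(f"file-level backup copies {file_cost} bytes")
print(f"block-level snapshot pins {block_cost} bytes")
```

Both sides of the argument show up here: the block-level cost is far smaller for a small edit to a big file, but even a one-byte change pins a whole block, so larger blocks narrow the gap.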

