Understanding ZFS: Compression
06 Nov '08 - 09:31 by benr
One of the most appealing features ZFS offers is its built-in compression. The tradeoff is self-evident: you consume additional CPU but conserve disk space. If you're running an OLTP database then compression probably isn't for you; however, if you are doing bulk data archiving this could be a huge win.
ZFS is built with the realization that in modern systems we typically have large amounts of memory and CPU available, and we should be provided with the means to put those resources to work. Contrast this with the traditional logic that compression slows things down, because we stop and compress the data before flushing it out to disk, which takes time. Consider that in some situations you may have significantly faster CPU and memory than you have IO throughput, in which case it may in fact be faster to read and write compressed data because you're reducing the quantity of IO through the channel! Compression isn't just about saving disk space... keep an open mind.
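To make that argument concrete, here is a back-of-the-envelope sketch. It assumes a simple pipeline in which compression and IO overlap perfectly, and every figure in it is made up for illustration rather than measured on any real system.

```python
def effective_write_rate(disk_mb_s, compress_mb_s, ratio):
    """Rate at which logical (uncompressed) data can be written, in MB/s.

    disk_mb_s:     raw throughput of the IO channel
    compress_mb_s: how fast the CPU can compress logical data
    ratio:         compression ratio, e.g. 2.0 means data shrinks to half
    """
    # Each MB that reaches the disk carries `ratio` MB of logical data,
    # but the CPU caps how fast logical data can be compressed at all.
    return min(disk_mb_s * ratio, compress_mb_s)


if __name__ == "__main__":
    # Hypothetical figures: an 80 MB/s disk and a 2x compression ratio.
    print(effective_write_rate(80, 400, 2.0))  # fast CPU: beats the raw 80 MB/s
    print(effective_write_rate(80, 60, 2.0))   # slow CPU: now the CPU is the bottleneck
```

With a CPU that compresses faster than the disk can write, the compressed path wins; with a slow or busy CPU, it loses. That is the whole tradeoff in two lines.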
The first important point about ZFS compression is that it's granular. Within ZFS we create datasets (some people call them "nested filesystems", but I find that confusing terminology), each of which has inherited properties. One of those properties is compression. Therefore, if we create a "home" dataset which mounts to "/home", and then create a "home/user" dataset for each user, we can do interesting things, such as apply per-user quotas (disk usage limits) or reservations (set aside space) or, in this context, enable, disable or specify differing types of compression. Some users may want compression, others may not, or you may wish all to use it by default. ZFS gives us a wide range of flexible options. Most importantly, if we change our mind at some point we can change the setting and all new data is compressed, while the old uncompressed data is still read as expected. This means that changes are non-disruptive; however, it also means that if you later want to conserve all the disk you can, you'd need to enable compression and then slosh all the data off and then back in.
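If moving everything off and back on isn't practical, one alternative is to rewrite files in place so their blocks get written again under the new setting. The sketch below is my own illustration, not a ZFS tool: the function name is hypothetical, it needs enough free space for a second copy of each file, it does not preserve ownership or hard links, and any existing snapshots will keep holding the old uncompressed blocks.

```python
import os
import shutil
import tempfile


def rewrite_in_place(path):
    """Rewrite a file so its data is written afresh under current settings."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary copy in the same directory so it lands in the
    # same dataset and the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rewrite-")
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)  # copies data plus mode and timestamps
        os.replace(tmp_path, path)    # atomic swap on POSIX filesystems
    except BaseException:
        # Clean up the partial copy and leave the original untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Try it on a scratch dataset first and compare "du" before and after; don't point it at files that are open for writing.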
So, how do we enable compression? Simple, use the zfs set compression=on some/dataset command. If we then "get all" properties from a dataset we'll see some interesting information. Here is an example (pruned for length) of my home directory:
root@quadra ~$ zfs get all quadra/home/benr
NAME              PROPERTY       VALUE                  SOURCE
quadra/home/benr  type           filesystem             -
quadra/home/benr  creation       Thu Oct 9 11:33 2008   -
quadra/home/benr  used           122G                   -
quadra/home/benr  available      432G                   -
quadra/home/benr  referenced     122G                   -
quadra/home/benr  compressratio  1.19x                  -
quadra/home/benr  mounted        yes                    -
quadra/home/benr  quota          none                   default
quadra/home/benr  reservation    none                   default
quadra/home/benr  recordsize     128K                   default
quadra/home/benr  mountpoint     /quadra/home/benr      default
quadra/home/benr  checksum       on                     default
quadra/home/benr  compression    on                     inherited from quadra/home
...
Here we see that compression is "on", and was inherited automatically from its parent dataset "quadra/home". We can also see the compression ratio above: 1.19x.
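What does 1.19x actually buy you? Here is a quick sketch of the arithmetic, assuming the ratio is simply logical size divided by on-disk size. The function names are mine, and treating the whole 122G "used" figure as compressed data is a simplification.

```python
def space_saved_fraction(compressratio):
    """Fraction of the logical size that compression saved on disk."""
    if compressratio <= 0:
        raise ValueError("compressratio must be positive")
    return 1.0 - 1.0 / compressratio


def logical_size(on_disk, compressratio):
    """Estimated uncompressed size for a given on-disk size."""
    return on_disk * compressratio


if __name__ == "__main__":
    ratio = 1.19  # compressratio from the listing above
    print(f"saved: {space_saved_fraction(ratio) * 100:.1f}% of logical size")
    print(f"122G on disk is roughly {logical_size(122, ratio):.0f}G of data")
```

So a ratio of 1.19x works out to roughly 16% saved, or about 23G on this dataset.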
But what are our options? Just on or off? Many ZFS properties have simple defaults; in this case "on" means that we use the "lzjb" compression algorithm. We can instead specify the exact algorithm. Currently, in fairly modern releases of Nevada/OpenSolaris, we have available the default LZJB (a lossless compression algorithm created by Jeff Bonwick, which is extremely fast) and gzip at compression levels 1-9. If you set "compression=gzip" you'll get gzip level 6 compression; however, you can explicitly "set compression=gzip-9". More compression algorithms may be added in the future. (The source is out there, feel free to give us another!)
But how can you see the effect? Did you know the "du" command will show you the on-disk (compressed) size of a file? Let's experiment!
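To get a feel for what the gzip levels do without touching a pool, here is a rough userland approximation. It assumes that deflating each 128K record independently with zlib is close enough to what the filesystem does; real on-disk numbers will differ, since allocations are rounded to whole sectors. The sample text is a stand-in, not the real book.

```python
import zlib

RECORD_SIZE = 128 * 1024  # matches the 128K recordsize shown earlier


def compressed_size(data, level, record_size=RECORD_SIZE):
    """Total bytes after deflating each record independently at `level`."""
    total = 0
    for offset in range(0, len(data), record_size):
        record = data[offset:offset + record_size]
        total += len(zlib.compress(record, level))
    return total


if __name__ == "__main__":
    # Stand-in text; read the bytes of a real file here for meaningful numbers.
    sample = b"Call me Ishmael. Some years ago--never mind how long precisely. " * 20000
    for level in (1, 6, 9):
        print(f"gzip-{level}: {len(sample)} -> {compressed_size(sample, level)} bytes")
```

Repetitive sample text like this compresses far better than real prose, so treat the output as a comparison between levels, not as a prediction.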
root@quadra ~$ zfs create quadra/test
root@quadra ~$ zfs get compression quadra/test
NAME         PROPERTY     VALUE  SOURCE
quadra/test  compression  off    default
Ok, we have a dataset to play with. I've downloaded Moby Dick and combined it into a single text file.
root@quadra test$ ls -lh moby-dick.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:38 moby-dick.txt
root@quadra test$ du -h moby-dick.txt
1.8M moby-dick.txt
root@quadra test$ head -4 moby-dick.txt
.. < chapter I 2 LOOMINGS >
Call me Ishmael. Some years ago--never mind how
long precisely --having little or no money in my purse, and nothing particular
Alright, so here is Moby Dick in text, weighing in at 1.8M uncompressed. Let's now enable compression (LZJB), copy the file and see how much benefit we get:
root@quadra test$ zfs set compression=on quadra/test
root@quadra test$ cp moby-dick.txt moby-dick-lzjb.txt
root@quadra test$ sync
root@quadra test$ ls -lh
total 3.5M
-rw-r--r-- 1 root root 1.8M Nov 6 01:40 moby-dick-lzjb.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:38 moby-dick.txt
root@quadra test$ du -ah
1.7M ./moby-dick-lzjb.txt
1.8M ./moby-dick.txt
3.5M .
Nice, we're saving some space. Now let's repeat with gzip.
root@quadra test$ zfs set compression=gzip quadra/test
root@quadra test$ cp moby-dick.txt moby-dick-gzip.txt
root@quadra test$ ls -lh
total 4.6M
-rw-r--r-- 1 root root 1.8M Nov 6 01:44 moby-dick-gzip.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:40 moby-dick-lzjb.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:38 moby-dick.txt
root@quadra test$ du -ah
1.7M ./moby-dick-lzjb.txt
1.8M ./moby-dick.txt
1.1M ./moby-dick-gzip.txt
4.6M .
Ahhhh. Nice gain there. Remember that this is really gzip-6; let's crank it up to gzip-9!
root@quadra test$ zfs set compression=gzip-9 quadra/test
root@quadra test$ cp moby-dick.txt moby-dick-gzip9.txt
root@quadra test$ ls -lh
total 4.6M
-rw-r--r-- 1 root root 1.8M Nov 6 01:44 moby-dick-gzip.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:46 moby-dick-gzip9.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:40 moby-dick-lzjb.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:38 moby-dick.txt
root@quadra test$ du -ah
1.7M ./moby-dick-lzjb.txt
1.8M ./moby-dick.txt
1.1M ./moby-dick-gzip.txt
512 ./moby-dick-gzip9.txt
4.6M .
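Wow! That's savings. (Be skeptical of that 512 figure, though: unlike the LZJB test, I didn't run sync before du here, so it almost certainly reflects data that hadn't been flushed to disk yet rather than real gzip-9 savings.) Just to put this in context, I'll test gzip'ing the file like you're used to (using tmpfs, not zfs):
root@quadra test$ cd /tmp
root@quadra tmp$ ls -alh moby-dick.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:47 moby-dick.txt
root@quadra tmp$ gzip moby-dick.txt
root@quadra tmp$ ls -alh moby-dick.txt.gz
-rw-r--r-- 1 root root 1.1M Nov 6 01:47 moby-dick.txt.gz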
And so we see here that just gzip'ing the file matches the compression I got with gzip (gzip-6) enabled.
But before you get too excited, remember that this is consuming system CPU time. The more compression you do, the more CPU you'll consume. If this is a dedicated storage system you're working on, then consuming a ton of CPU for compression may well be worth it (many appliances have fast CPUs for just this reason); however, if you're running critical apps and CPU really counts, then notch it down or even turn it off. I highly recommend you dry-run your application workload and then load test it hard to see whether or not that extra CPU will be a problem. Whenever possible, try to determine these things before you deploy, not after.
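For a crude first look at the CPU side before a real load test, you can time the compressor on a sample of your own data. This is only a sketch: it times userland zlib, not the kernel's compression path, so treat the numbers as relative between levels and not as what the filesystem will deliver. The sample data is a placeholder.

```python
import time
import zlib


def compress_seconds(data, level, repeats=3):
    """Best-of-N wall-clock seconds to deflate `data` at the given level."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        zlib.compress(data, level)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Placeholder sample; read a chunk of your real workload data instead.
    sample = b"Call me Ishmael. Some years ago--never mind how long precisely. " * 20000
    for level in (1, 6, 9):
        seconds = max(compress_seconds(sample, level), 1e-9)  # avoid dividing by zero
        print(f"level {level}: {len(sample) / seconds / 1e6:.0f} MB/s")
```

This is no substitute for load testing the actual application on the actual box, but it tells you quickly whether a level is even in the right ballpark.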
To follow up that idea, remember that we can set differing compression levels on different datasets. You may want to put your application data on an uncompressed dataset, but store less commonly used data or backups on a separate dataset where you've cranked compression up. Get creative!
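Here is a small sketch of how such a per-dataset plan might be written down and turned into commands. The pool and dataset names are hypothetical, the list of valid values covers only the settings mentioned in this post, and the script just prints the commands rather than running them.

```python
# Values discussed in this post; other ZFS releases may accept more.
VALID_VALUES = {"on", "off", "lzjb", "gzip"} | {f"gzip-{n}" for n in range(1, 10)}


def zfs_set_compression(dataset, value):
    """Build (but do not run) the zfs command for one dataset."""
    if value not in VALID_VALUES:
        raise ValueError(f"unsupported compression value: {value!r}")
    return ["zfs", "set", f"compression={value}", dataset]


if __name__ == "__main__":
    # Hypothetical layout: busy application data stays uncompressed,
    # home directories get LZJB, backups get the heaviest gzip.
    plan = {
        "tank/apps": "off",
        "tank/home": "on",
        "tank/backups": "gzip-9",
    }
    for dataset, value in plan.items():
        print(" ".join(zfs_set_compression(dataset, value)))
```

Keeping the plan in one place makes it easy to review before anything is changed, and remember that each setting only affects data written after it is applied.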
ZFS is an amazing technology and compression is certainly one of its big attractions for the common user. Workstation always low on disk? Compression to the rescue, no stupid FUSE or loopback tricks required. :)
A Word Of Warning
At this point I do want to warn you of something. Notice that du displays actual disk consumption, not true file size. Now consider the way in which most admins actually use the command... to total up cumulative file sizes. On a typical file system, "du -sh ." will nicely total up all the files, which would be the same as if I tar'ed up the files and looked at the tarball's filesize. When using compression you can not use "du" in this way because the files are larger than the actual disk usage. So you get into potentially confusing situations like this:
root@quadra test$ ls -alh
total 5.6M
drwxr-xr-x 2 root root 6 Nov 6 01:46 .
drwxr-xr-x 8 root root 8 Nov 6 01:33 ..
-rw-r--r-- 1 root root 1.8M Nov 6 01:44 moby-dick-gzip.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:46 moby-dick-gzip9.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:40 moby-dick-lzjb.txt
-rw-r--r-- 1 root root 1.8M Nov 6 01:38 moby-dick.txt
root@quadra test$ du -h .
5.6M .
root@quadra test$ tar cfv md.tar moby-dick*
moby-dick-gzip.txt
moby-dick-gzip9.txt
moby-dick-lzjb.txt
moby-dick.txt
root@quadra test$ ls -lh md.tar
-rw-r--r-- 1 root root 7.0M Nov 6 02:11 md.tar
In the real world, this could come as a shock if you wanted to rsync a bunch of data, totalled it up using "du" to estimate the bits that need to move, and then got nervous when you moved way more bits than you initially expected because you forgot to take compression into the equation. So hopefully some of you can learn here not just how ZFS works, but appreciate "du" in a new way as well. :)
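If you want both numbers before a big copy, a few lines of script will total them side by side. This is a sketch under some assumptions: it relies on st_blocks being counted in 512-byte units, which is the usual Unix convention, it only looks at regular files, and it counts hard-linked files once per name.

```python
import os
import stat


def tree_sizes(root):
    """Return (apparent_bytes, on_disk_bytes) summed over regular files."""
    apparent = 0
    on_disk = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, device nodes
            apparent += st.st_size        # what ls reports, and what gets copied
            on_disk += st.st_blocks * 512  # what du reports
    return apparent, on_disk


if __name__ == "__main__":
    apparent, on_disk = tree_sizes(".")
    print(f"apparent: {apparent} bytes, on disk: {on_disk} bytes")
```

The apparent total is the one to use when estimating how much data a copy or rsync has to move; the on-disk total is the one that tells you how full the pool is.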
wow, this is a great post! I’d heard a lot of good things about ZFS, and the more I hear, the more I hope Sun releases it on Linux. I have my doubts about how soon that will happen, though.
If it weren’t for the lack of a Solaris brain trust here, I’d be tempted to switch. I just don’t have the experience with it. I just keep getting more and more incentive to work with it.
Matt Simmons - 06 November '08 - 12:46
What the freak at logo of blog?
aZ - 06 November '08 - 14:47
Is there any way to determine which files have been compressed on the filesystem and which have not? I can see how you can determine the properties of the ZFS filesystems but I don’t see a way of determining the properties associated with a particular object (file, directory, etc.). If you could tell that a file was or was not created with compression, you could write a script which would only “slosh off and back on” files which needed their compression status changed. Similar considerations might apply when you need to adjust other properties (encryption, copies, etc.) as well.
Rand Huntzinger - 06 November '08 - 15:08
Good post as usual Ben!
Dave Pickens - 06 November '08 - 16:46
Ben,
Thanks for such a comprehensive post. I’ve mainly been using ZFS in-house, using it in anger for non-critical stuff so I feel more comfortable when I jump to deploying business critical apps on it.
But your post led me to thinking: I’ve also been considering throwing together a fat-ass drive array + mobo for HTPC storage accessed via iSCSI (that way I can have a huge quantity of video and audio content, and the noisy box can get stashed somewhere far away from the home theatre). Do you think that ZFS+compression would cope with that, provided the CPU had enough capacity? From the above, I can’t see why not… thus giving me even more “space” for my data!
Also, do you think (Open)Solaris needs to add a legacy option to du to “help” those of us sysadmins who regularly jump between say Solaris, Ubuntu and Mac OS X? And might just forget that we’re not necessarily looking at the actual total file size, which as you say is what we’re accustomed to?
Mark Glossop - 06 November '08 - 16:54
Mark,
I’m pretty sure that’s an ideal usage because the CPU would otherwise be pretty idle. It’s where the processing and the storage occur in the same place that you’d get into trouble, such as if you were decoding/encoding data directly on the system with ZFS compression. It all depends on whether you’re I/O bound or CPU bound.
I’m glad to hear about the options to compression. I’ve previously only been using on/off, but I’m interested in trying to get some more space out of backup boxes for which space is much more critical than storage.
One question, how granular is the compression, really? That is, is compression less of an issue with large sequential accesses over small random accesses? Is the compression over each block?
Drew - 07 November '08 - 18:15
I have a hypothesis that compression performance in filesystems isn’t intuitive, and it’s always the case that if one tries it, the performance will “feel” better than one expects.
There are two reasons for this:
1. It is natural to think of compression in simple space/time tradeoff terms. However, every block of data that doesn’t have to be transferred also saves the CPU time required to manage the transfer. Furthermore, in terms of CPU time accounting, that time is often attributed to general system overhead costs and does not get credited to a specific user process, so accounting data will often show the additional CPU cost, but not the related CPU savings.
2. It is natural to think of files which aren’t human readable as random data, and hence not compressible or less compressible than they actually are.
Dave Hamaker - 04 December '08 - 21:19
> When using compression you can not use “du” in this way because the
> files are larger than the actual disk usage.
i don't have solaris around here at home, but why not use gnu's “du”? that one has “--apparent-size” as an option and iirc, it gave me the real size of compressed files on zfs-fuse.
here`s the manpage: [[http://www.gnu.org/software/coreutils/..]]
roland - 01 January '09 - 18:03
Hey, uh, I downloaded moby.zip from the URL you gave, concatenated all the files inside except for moby.0 (which isn’t in your head output) and README, but the resulting file was only 1202893 bytes. Where are your other 600k coming from?
Also, what’s wrong with LZJB that it could only compress that file by 7% or so? I don’t have an LZJB implementation handy, but LZF (which is usually lightning fast but for some reason takes 100ms on this file) compresses it by 40% down to 737181 bytes, and gzip compresses it by 53% to 59% (569776 to 487367 bytes, -1 and -9), not the mere 49% you got. And what’s up with gzip compressing the file to 512 bytes the second time?
Kragen Javier Sitaker - 24 January '09 - 06:02