KVM/Qemu and caching of I/O

I like to live “on the edge”. At least technologically speaking.

As a consequence, in my environment, I’ve got a couple of KVM guests that are running Fedora 12 with Red Hat Cluster v3.0.6 installed. That’s not really “living on the edge”. The “living on the edge” part of that configuration is that the two guests share a clustered file system. This clustered file system is hosted on a DRBD replicated volume between two standard internal SATA drives hosted on two different KVM host systems. And these host systems are, in turn, their own Fedora 12 based Red Hat Cluster.

Obviously, there are plenty of opportunities for data to go “missing” (get corrupted/get lost/disappear/etc) in a configuration like this. And I thought I’d been able to eliminate them all.

That was what I thought, until I ran one of the KVM guests on one of the hosts, and the other on the other. My GFS2 file system wasn’t impressed! And I was stumped. DRBD had been configured with synchronous replication (let’s not talk about the performance impact of that decision, shall we…?) but obviously the data wasn’t being committed simultaneously to both drives[1].

I now suspect that’s happening because the KVM hosts were caching the data on the guests’ behalf. That could be a very spiffy performance boost[2], but it causes all sorts of problems for my clustered applications that rely on the data in the file system being where it’s supposed to be.

So, I had to dig around a little and discovered that Qemu/KVM/libvirt actually supports setting the caching properties for the ‘physical’ devices backing its virtual hard drives (i.e. the hard drives or container files exported to the guest as “disks”). And if you’re using the CLI interfaces for managing KVM (libvirtd & virsh), it’s fairly easy to set it to what you want/need it to be.

The caching properties you can set are:

  • writeback
  • writethrough
  • none
  • default
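
To make the difference between these modes a bit more concrete, here’s a toy model in Python. It is my own simplification, not how Qemu is actually implemented: the SharedDisk and Host classes and their methods are made up for illustration, and “default” isn’t modelled since it just leaves the choice to the hypervisor. The idea is that with writeback a write is acknowledged once it sits in the host’s page cache, with writethrough it is acknowledged only after it has also reached the disk, and with none the host’s page cache is bypassed altogether.

```python
class SharedDisk:
    """Stands in for the replicated block device that both hosts can see."""

    def __init__(self):
        self.blocks = {}


class Host:
    """Toy KVM host with a page cache sitting in front of the shared disk."""

    MODES = ("writeback", "writethrough", "none")

    def __init__(self, disk, cache_mode):
        if cache_mode not in self.MODES:
            raise ValueError("unknown cache mode: %r" % cache_mode)
        self.disk = disk
        self.cache_mode = cache_mode
        self.page_cache = {}   # block number -> data held in host RAM
        self.dirty = set()     # blocks cached but not yet written to disk

    def write(self, block, data):
        if self.cache_mode == "none":
            # Bypass the host page cache entirely.
            self.disk.blocks[block] = data
            return
        self.page_cache[block] = data
        if self.cache_mode == "writethrough":
            # Cached for later reads, but also written to disk right away.
            self.disk.blocks[block] = data
        else:
            # writeback: acknowledged while the data lives only in host RAM.
            self.dirty.add(block)

    def read(self, block):
        if self.cache_mode != "none" and block in self.page_cache:
            return self.page_cache[block]
        return self.disk.blocks.get(block)

    def flush(self):
        """Push all dirty blocks from the page cache to the disk."""
        for block in self.dirty:
            self.disk.blocks[block] = self.page_cache[block]
        self.dirty.clear()

    def crash(self):
        """Host dies: whatever was only in RAM is gone."""
        self.page_cache.clear()
        self.dirty.clear()


disk = SharedDisk()
host_a = Host(disk, "writeback")
host_b = Host(disk, "none")

host_a.write(7, "journal entry")
print(host_b.read(7))   # None: the write is still sitting in host_a's cache
host_a.flush()
print(host_b.read(7))   # journal entry
```

In this model the guest on host_a believes its write has completed, yet the guest on host_b can’t see it until host_a gets around to flushing. If host_a crashes before the flush, the data never reaches the disk at all.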

Unfortunately, I’ve not been able to locate a way to set this while creating the guest with virt-manager. However, virt-install does let you set it (through the cache= suboption of --disk), and if the guest is inactive (i.e. not running), you can set it by editing the <driver> tag of the disk in the guest’s XML definition, for instance with virsh edit.

For example:

<disk type='block' device='disk'>
   <driver name='qemu' cache='none'/>
   <source dev='/dev/mapper/sharedVG01-www--local'/>
   <target dev='vde' bus='virtio'/>
</disk>

NOTE: Early versions of libvirt may not support the <driver cache=''> nomenclature. I’m using 0.7.5 in my environment, but I believe any recent version of libvirt (0.7 and later, for sure) includes support for this. To check your libvirt version, issue the command:
# virsh version
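
If you have more than a couple of guests, eyeballing every <driver> tag gets tedious. Here’s a rough Python sketch of how the check and the change could be scripted. It assumes you’ve saved the guest definition to a string (for example from the output of virsh dumpxml); the function names disk_cache_modes and set_disk_cache are my own, not part of libvirt, and the sketch only touches disks that already have a <driver> element.

```python
import xml.etree.ElementTree as ET


def disk_cache_modes(domain_xml):
    """Return {target dev: cache attribute} for every <disk> in the XML.

    The value is None when the disk has no <driver> tag or no cache attribute.
    """
    root = ET.fromstring(domain_xml)
    modes = {}
    for disk in root.iter("disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        dev = target.get("dev") if target is not None else None
        modes[dev] = driver.get("cache") if driver is not None else None
    return modes


def set_disk_cache(domain_xml, mode="none"):
    """Return a copy of the XML with cache=mode on every existing <driver>."""
    root = ET.fromstring(domain_xml)
    for disk in root.iter("disk"):
        driver = disk.find("driver")
        if driver is not None:
            driver.set("cache", mode)
    return ET.tostring(root, encoding="unicode")


# A cut-down guest definition, using the disk from the example above
# but without any cache attribute set.
sample = """
<domain type='kvm'>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu'/>
      <source dev='/dev/mapper/sharedVG01-www--local'/>
      <target dev='vde' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

print(disk_cache_modes(sample))                          # {'vde': None}
print(disk_cache_modes(set_disk_cache(sample, "none")))  # {'vde': 'none'}
```

The modified XML still has to be fed back to libvirt (with the guest shut down) before it has any effect.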

Apropos:


[1] = I know, I know. A DRBD mirror set up to use “protocol C” doesn’t, technically, commit the data simultaneously to both devices. It only “looks” like that because the write() operation does not return success until the data has been successfully written on the “remote” device as well as the local one.

[2] = It is, actually. As an example, the reason the likes of Xen, KVM, etc. have been able to post IO benchmarks that are more than 100% of the performance of the underlying hardware is that the host environment caches the data on the guests’ behalf. Looks good on benchmarks. Not so much if your host fails before the data have been flushed from the host cache onto the physical disk devices. Applications tend to get cranky when data they “know” was committed to persistent storage is missing…
