When running a virtual machine, the virtual environment has to present devices to the guest OS – disks and network being the main two (plus video, USB, timers, and others). Effectively, this is the hardware that the VM guest sees.
Now, if the guest is to be kept entirely ignorant of the fact that it is virtualised, this means that the host must emulate some kind of real hardware. This is quite slow (particularly for network devices), and is the major cause of reduced performance in virtual machines.
However, if you are willing to let the guest OS know that it's in a virtual environment, it is possible to avoid the overheads of emulating much of the real hardware, and use a far more direct path to handle devices inside the VM. This approach is called paravirtualisation. In this case, the guest OS needs a particular driver installed which talks to the paravirtual device. Under Linux, this interface has been standardised, and is referred to as the "virtio" interface.
Grab the latest 0.9.1 qemu sources, and the virtio device patches. The original patches are available from the qemu-devel mailing list, in four emails: intro, mail 1, mail 2, mail 3, but they don't apply cleanly to the 0.9.1 source. There are updated patches on my site at file 1, file 2, file 3 if you don't feel like porting the patches yourself.
Apply the patches, build and install qemu as normal.
In order to get the necessary support in the guest kernel, you will need the latest pre-release kernel. You can get this either from Linus's git repository, or (as of this weekend) as 2.6.25-rc1 or later from the kernel.org repository. If you are running Ubuntu in the guest, I'm told that the Hardy Alpha 4 kernel has the necessary patches in it.
The configuration options you need turned on are:
PARAVIRT_GUEST: -> Processor type and features -> Paravirtualized guest support
LGUEST_GUEST: -> Processor type and features -> Paravirtualized guest support -> Lguest guest support
VIRTIO_PCI: -> Virtualization (VIRTUALIZATION [=y]) -> PCI driver for virtio devices
VIRTIO_BLK: -> Device Drivers -> Block devices (BLK_DEV [=y]) -> Virtio block driver
VIRTIO_NET: -> Device Drivers -> Network device support (NETDEVICES [=y]) -> Virtio network driver
I suggest building the latter three (VIRTIO_PCI, VIRTIO_BLK, and VIRTIO_NET) as modules.
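Assuming those drivers are built as modules, the relevant fragment of the guest kernel's .config would look something like this (a sketch; the exact set of symbols may differ slightly between kernel versions):

```
CONFIG_PARAVIRT_GUEST=y
CONFIG_LGUEST_GUEST=y
CONFIG_VIRTIO=m
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BLK=m
CONFIG_VIRTIO_NET=m
```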
There is also a VIRTIO_BALLOON driver for dealing with dynamic memory allocation, and a VIRTIO_CONSOLE driver. I don't know if either of these has support in qemu yet – I suspect not.
To start qemu with a virtio block device, you will need at minimum the following option:
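With the patched qemu, that's the if=virtio variant of the -drive option; something along these lines (the image path is just an example, and since you can't yet boot from a virtio disk, a conventional -hda is shown alongside it):

```shell
qemu -hda boot.img -drive file=/path/to/data.img,if=virtio
```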
This will map the given block device or file on the host as a virtio-blk device on the guest. You can specify more -drive options to add more virtio devices.
In the guest OS, you will need the modules virtio-blk and virtio-pci loaded. This should then create devices /dev/vda, /dev/vdb, etc. – one for each virtio -drive option you specified on the command line to qemu. These devices can be treated just like any other hard disk – they can be partitioned, formatted, and filesystems mounted on them. Note that at the moment, there doesn't seem to be any support for booting off them, so you will need at least one non-virtio device in the VM.
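As a sketch of the guest-side steps (assuming the drivers were built as modules, and using the first virtio disk whole rather than partitioning it):

```shell
# load the virtio drivers
modprobe virtio-pci
modprobe virtio-blk

# make a filesystem on the first virtio disk and mount it
mkfs.ext3 /dev/vda
mkdir -p /mnt/virtio
mount /dev/vda /mnt/virtio
```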
In getting this working, I discovered that the LVM tools don't know about virtio block devices, and thus ignore them. There is a simple one-line patch which teaches the LVM tools about virtio devices. With that patch included, it all works nicely.
To run a virtio-type network card, you will need the virtio-net and virtio-pci modules loaded in the guest kernel. You can then create a virtio network device in the guest system with the qemu option:
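That's the model=virtio form of -net nic; something like:

```shell
qemu -hda boot.img -net nic,model=virtio
```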
Of course, you will also need a -net tap, -net user or -net socket option to create the paired device on the host side, as normal. Your network device will then appear as eth0. I haven't managed to get my virtio network device working properly yet, so your mileage may vary with this one. My problem is that it's simply not talking to the outside world – the device exists and can be manipulated with ifconfig, and it responds to pings, but no data seems to reach the host network.
I've not tested performance on the network side (because I can't get it to work), but my tests on the block device show:
With the emulated disk:

real 53m56.731s
user 23m38.001s
sys  26m33.772s

With the virtio disk:

real 33m21.242s
user 10m52.030s
sys  11m32.950s
I'd say that's a win.
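To put a number on it, here is the wall-clock comparison, using the times from the two runs above (the first, slower run being the emulated disk):

```python
# wall-clock times from the two runs, converted to seconds
emulated = 53 * 60 + 56.731   # 53m56.731s with the emulated disk
virtio   = 33 * 60 + 21.242   # 33m21.242s with the virtio disk

speedup = emulated / virtio          # how many times faster virtio was
saving  = 1 - virtio / emulated      # fraction of wall-clock time saved

print(f"{speedup:.2f}x faster")           # roughly 1.62x
print(f"{saving:.0%} less wall-clock time")  # roughly 38%
```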
Note that this is a completely unscientific test of a single workload under barely-controlled conditions. The results above may or may not be indicative of actual performance gains for other workloads. Your mileage may vary. Contents may have settled in transit. No user-serviceable parts. The value of your investment may go down as well as up.