I’d like to share how I configured my host-os to support VMs with secure and flexible storage.
My setup is based on Ubuntu Lucid Lynx, 64-bit, running on an Intel Core i7 with 8GB of memory. Storage is provided by a RAID-5 array of four 1TB Western Digital disks, with LVM on top. Each VM also runs LVM.

Host OS

My host-os is a debootstrap’ed install of Ubuntu. The idea is that my host-os doesn’t need to run anything other than the essentials needed to provide networking, storage and VMs. Because I work with the GUI of virt-manager, I had to install some X11 packages and enable SSH X11-forwarding to pull the GUI to wherever I am.

Networking on host-os

My host-os has a dedicated IP. All my VMs have their own IP in a /29 IP-netblock assigned to me. This /29 is routed to my host-os’ dedicated IP by my ISP, but I have not configured any of these IPs on my host-os. The host-os only has a route for the network on its br0 bridge interface so it knows this traffic is ‘local’ to it:

# The primary network interface
auto eth0
iface eth0 inet manual

# The primary bridge
auto br0
iface br0 inet static
    address 213.154.229.18
    netmask 255.255.255.192
    gateway 213.154.229.1
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0
    up /sbin/ip -6 addr add 2001:7b8:3:47:213:154:229:18/64 dev br0
    up /sbin/ip route add 213.154.236.176/29 dev br0
    up echo 1 > /proc/sys/net/ipv4/conf/br0/forwarding

IPv6 is autoconfigured by router advertisements on the network, but because MAC addresses in virtualised setups tend to change, I add a static IPv6 address to each server. The static IPv6 address is in the same network as the EUI-64 autoconfigured address, so this works routing-wise.
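
The static addresses in my configs appear to follow a simple convention: the four decimal octets of the IPv4 address are reused as the last four groups of the IPv6 address. Below is a minimal Python sketch of that convention. The function name `static_ipv6` is made up for illustration, and the /64 prefix is simply the one visible in the config above.

```python
import ipaddress

def static_ipv6(prefix64, ipv4):
    """Build a static IPv6 address by reusing the IPv4 octets as the
    last four groups (assumed convention, read off the configs)."""
    # Validate the IPv4 address, then split it into its decimal octets.
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")
    # Decimal digits are also valid hex digits, so each octet can be
    # used verbatim as one 16-bit group of the IPv6 address.
    return ipaddress.IPv6Address(prefix64 + ":" + ":".join(octets))

host_v6 = static_ipv6("2001:7b8:3:47", "213.154.229.18")
vm_v6 = static_ipv6("2001:7b8:3:47", "213.154.236.176")
print(host_v6)
print(vm_v6)
```

Because both addresses land in the same /64 as the autoconfigured ones, the routes learned from router advertisements keep working.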

And since my host-os does not function as a gateway for the VMs, I can use all eight IPs routed to my host-os IP. None are lost to network, broadcast and gateway addresses.
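
To make the difference concrete, here is a small Python sketch using the standard `ipaddress` module. It compares the eight addresses in the /29 with what would be left if the block were used as a conventional subnet with its own gateway. The variable names are only for illustration.

```python
import ipaddress

block = ipaddress.ip_network("213.154.236.176/29")

# Routed setup: every address in the block can be given to a VM as a /32.
all_addresses = list(block)

# Conventional subnet: network and broadcast addresses are reserved,
# and one more address goes to the gateway inside the subnet.
classic_hosts = list(block.hosts())
classic_usable = len(classic_hosts) - 1

print(len(all_addresses), "usable when routed")
print(classic_usable, "usable as a conventional subnet")
print("first:", all_addresses[0], "last:", all_addresses[-1])
```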

Networking on VMs

To use the eight IPs from the /29 and not route them myself, I had to manually configure routing in the VMs. I use ‘ip’ from the iproute package. It beats ifconfig by a long shot:

# The primary network interface
auto eth0
iface eth0 inet manual
	up /sbin/ip link set eth0 up
	up /sbin/ip addr add 213.154.236.176/32 dev eth0
	up /sbin/ip -6 addr add 2001:7b8:3:47:213:154:236:176/64 dev eth0
	up /sbin/ip route add 213.154.236.176/29 dev eth0
	up /sbin/ip route add 213.154.229.0/26 dev eth0
	up /sbin/ip route add default via 213.154.229.1 dev eth0
	up /usr/sbin/arp -s 213.154.229.1 00:00:5e:00:01:01
	down /usr/sbin/arp -d 213.154.229.1
	down /sbin/ip route flush dev eth0
	down /sbin/ip addr flush dev eth0
	down /sbin/ip link set eth0 down

So, I bring the link up, configure the IP address as a /32, add a route for the network this VM is in, and add a route for the network my host-os’ gateway is in. This last step is essential: the default gateway is in another network, and the VM needs to know how to reach it before the default route can be added.
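
Since every VM needs the same steps with a different address, the sequence can be sketched as a small Python helper that renders the commands in the right order. This is only an illustration: `vm_up_commands` is a hypothetical name, and the helper prints commands rather than running them.

```python
import ipaddress

def vm_up_commands(vm_ip, vm_net, gw_net, gateway, dev="eth0"):
    """Return the 'ip' commands, in order, for a VM whose default
    gateway lives outside the VM's own netblock."""
    ip = ipaddress.ip_address(vm_ip)
    net = ipaddress.ip_network(vm_net)
    gwnet = ipaddress.ip_network(gw_net)
    gw = ipaddress.ip_address(gateway)
    if ip not in net:
        raise ValueError("VM address is not inside the VM netblock")
    if gw not in gwnet:
        raise ValueError("gateway is not inside the gateway network")
    return [
        f"ip link set {dev} up",
        f"ip addr add {ip}/32 dev {dev}",
        f"ip route add {net} dev {dev}",
        # The on-link route to the gateway's network must come first,
        # otherwise the default route has no way to reach the gateway.
        f"ip route add {gwnet} dev {dev}",
        f"ip route add default via {gw} dev {dev}",
    ]

commands = vm_up_commands(
    "213.154.236.176", "213.154.236.176/29",
    "213.154.229.0/26", "213.154.229.1",
)
print("\n".join(commands))
```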

Then there’s the only weak spot in this design: I manually add a static ARP entry for the default gateway. Somehow ARP doesn’t do its thing, so the VM in 213.154.236.176/29 doesn’t get the MAC address for the gateway automatically. I have not looked into this yet.

IPv6 on VMs is autoconfigured by router advertisements on the host-os network. They travel across the bridge interface to the VMs that are connected to it. This means it is once again really easy to configure IPv6 on a VM: just add a static IP and use the route info from router advertisements.

Storage on host-os

My host has 4 disks, all partitioned as follows:

   Device Boot  Start         End      Blocks   Id  System
/dev/sda1   *       1        1216     9764864   fd  Linux raid autodetect
/dev/sda2        1216        1703     3906560   fd  Linux raid autodetect
/dev/sda3        1703      121602   963089408   fd  Linux raid autodetect

These twelve partitions are used in mdadm software-raid:

md0 : active raid1 sdd1[3] sdb1[1] sdc1[2] sda1[0]
md1 : active raid1 sda2[0] sdc2[1]
md2 : active raid1 sdd2[1] sdb2[0]
md3 : active raid5 sda3[0] sdd3[3] sdb3[1] sdc3[2]

The function of each of these RAID-arrays is explained below:

  • /dev/md0
    Is used as the ~10GB root-fs for my host-os.
  • /dev/md1 and /dev/md2
    Are both used as swap partitions. I intentionally made these two arrays of two disks each, to double the swap space they offer.
  • /dev/md3
    This ~3TB device is used as a PV (physical volume) in the VG (volume group) ‘vms’ which provides LVs (logical volumes) for my VMs to use as ‘raw block device’-storage.
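
The sizes above can be checked with a bit of arithmetic. The Python sketch below assumes the ‘Blocks’ column of the partition table is in 1 KiB units and ignores the small overhead of the md metadata, so the numbers are approximate. The helper names are made up for illustration.

```python
KIB = 1024

def raid1_usable(member_blocks):
    """A mirror offers the capacity of a single member."""
    return member_blocks

def raid5_usable(member_blocks, members):
    """RAID-5 spends one member's worth of space on parity."""
    return member_blocks * (members - 1)

root_blocks = raid1_usable(9764864)         # md0, mirrored over 4 disks
swap_blocks = 2 * raid1_usable(3906560)     # md1 + md2, two mirrors
vm_blocks = raid5_usable(963089408, 4)      # md3

for name, blocks in [("root", root_blocks), ("swap", swap_blocks), ("vms", vm_blocks)]:
    size_bytes = blocks * KIB
    print(f"{name}: {size_bytes / 1e9:.2f} GB ({size_bytes / 2**30:.2f} GiB)")
```

With one four-disk mirror for swap instead of two two-disk mirrors, the swap space would be half of what the sketch reports.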

A little more on the LVM-setup. The complete /dev/md3 device is one PV for VG ‘vms’. Each VM has just one LV in this VG which functions as the ‘raw disk device’ for the VM:

root@vm:~# lvscan
  ACTIVE            '/dev/vms/vm_dot' [500.00 GiB] inherit
  ACTIVE            '/dev/vms/vm_pound' [500.00 GiB] inherit
  ACTIVE            '/dev/vms/vm_services' [50.00 GiB] inherit
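
To see how much of the VG is handed out, the lvscan output can be summed up. This is a rough Python sketch that parses the text shown above. It assumes the output format stays as printed here, and the function name `lv_sizes_gib` is hypothetical.

```python
import re

# Matches e.g.:  ACTIVE   '/dev/vms/vm_dot' [500.00 GiB] inherit
LVSCAN_LINE = re.compile(r"'([^']+)'\s+\[([\d.]+)\s+([KMGT]iB)\]")

# Conversion factors from each unit to GiB.
TO_GIB = {"KiB": 1 / 1024**2, "MiB": 1 / 1024, "GiB": 1.0, "TiB": 1024.0}

def lv_sizes_gib(lvscan_output):
    """Map each LV path found in lvscan output to its size in GiB."""
    sizes = {}
    for line in lvscan_output.splitlines():
        match = LVSCAN_LINE.search(line)
        if match:
            path, value, unit = match.groups()
            sizes[path] = float(value) * TO_GIB[unit]
    return sizes

SAMPLE = """\
  ACTIVE            '/dev/vms/vm_dot' [500.00 GiB] inherit
  ACTIVE            '/dev/vms/vm_pound' [500.00 GiB] inherit
  ACTIVE            '/dev/vms/vm_services' [50.00 GiB] inherit
"""

sizes = lv_sizes_gib(SAMPLE)
print(sum(sizes.values()), "GiB allocated to VMs")
```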

Storage on VMs

My VMs, if Linux, run LVM on their ‘drives’ too. This way I can resize partitions inside the VM and the VM’s entire ‘drive’ too!

If I run ‘partprobe’ in my host-os, I can see the partitions of my VMs in my host-os:

root@vm:~# partprobe
root@vm:~# ls -l /dev/mapper
[ .. ]
brw-rw---- 1 root disk 251,  0 2010-07-18 21:50 vms-vm_dot
brw-rw---- 1 root disk 251, 16 2010-07-18 21:50 vms-vm_dotp1
brw-rw---- 1 root disk 251, 17 2010-07-18 21:50 vms-vm_dotp2
brw-rw---- 1 root disk 251, 18 2010-07-18 21:50 vms-vm_dotp5
root@vm:/dev# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "dot" using metadata type lvm2
  Found volume group "pound" using metadata type lvm2
  Found volume group "vms" using metadata type lvm2

So as you can see, LVM on the host-os sees the VGs and LVs created inside the VMs. If I had used encryption in the VM installation, the VM disks would not be available to the host-os, of course.

KSM: Kernel Samepage Merging

A really nice feature in 2.6.32+ kernels is Kernel Samepage Merging. Basically, it finds memory pages with identical content and maps them to a single copy, so that only one copy of the data is kept in memory for all allocations. This means VMs share memory that has the same content, which immensely decreases the memory pressure of VMs on the host-os. More information on the subject can be found on the internet.

My setup shares ~400,000 pages on average. Each page is 4096 bytes, so this makes for roughly 1.5GB of shared memory, which is just marvelous.
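
The arithmetic can be sketched in a few lines of Python. The kernel exposes its KSM counters under /sys/kernel/mm/ksm, and the sketch below reads `pages_sharing` when that file is present. Which counter best matches my ~400,000 figure is an assumption here, and the helper names are made up for illustration.

```python
from pathlib import Path

PAGE_SIZE = 4096  # bytes per page, as on my setup

def ksm_bytes(pages, page_size=PAGE_SIZE):
    """Convert a KSM page count into bytes."""
    return pages * page_size

def read_ksm_counter(name="pages_sharing", base="/sys/kernel/mm/ksm"):
    """Return a KSM counter from sysfs, or None if KSM is not available."""
    path = Path(base) / name
    return int(path.read_text()) if path.exists() else None

# Fall back to the average from this post when sysfs has no KSM counters.
pages = read_ksm_counter()
if pages is None:
    pages = 400000

example_bytes = ksm_bytes(400000)
print(f"{pages} pages = {ksm_bytes(pages) / 2**30:.2f} GiB")
```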

Concluding

I migrated from a ‘single host setup’ to this ‘virtualized setup’ after a disk crash on my old system. I had good backups and was able to restore almost everything. This gave me the opportunity to split my own stuff from the stuff I host for friends. They now ‘live in their own VM’. I also introduced a VM dedicated to ‘services’, like DNS, SpamAssassin, MySQL, etcetera.

While this all seems to work really well, I am now faced with the question of how to efficiently back up all essential configuration and data from these VMs. This requires a different strategy than a ‘single host setup’.

Happy for now. Performs very well. Finally my hardware is stable.