A few months ago a client purchased a new server and asked me to set them up with Linux-based virtual machines to consolidate their disparate hardware. It was a big server, with 128 Gigs of memory. I chose to use Debian Wheezy and QEMU KVM for the virtualization. We’d be running a couple Windows Server instances, and a few Debian GNU/Linux instances on it.
Unfortunately, we ended up encountering some strange problems during heavy disk IO. The disk subsystem is a very fast SAS array, also on board the server, which is a Dell system. The VM’s disks are created as logical volumes on the array using LVM.
Each night, in addition to the normal backups they do within the Windows servers, they also wanted a snapshot of the disk volumes sent off to a backup server. That's no problem with LVM, even with the VMs running, when you use the LVM snapshot feature. It does tend to create a lot of disk IO, though.
What ended up happening was that occasionally, every week or two, the larger of the Windows server instances would slowly grind to a halt and eventually lock up. The system logs on the real server would begin filling with timeouts from Udev, complaining that it was not able to kill children. This would, in turn, affect the whole system, making a reboot of the whole server necessary. Very, very ugly, and very, very embarrassing.
I tried a couple off-the-cuff fixes that were shots in the dark, hoping for an easy fix. But the problem didn’t go away. So I had to dig in and research the problem.
It turns out that the Udev daemon decides how many children it will allow based upon the amount of memory the server has. In our case that was 128G, which is quite a lot. As far as I could tell, the number of allowed children simply scaled up with memory. With this much memory, that many children seemed to be overloading the threaded IO capacity of this monster server, causing processes to block while live LVM snapshots were being copied.
What I ended up doing was manually specifying that the maximum number of allowed children for Udev would be 32 instead of the obscene number the inadequate calculation in the Udev code came up with. Since doing this, the server has run perfectly, without a hitch, for a good, long time.
So this is for anyone who may have run into a similar problem. I could find no information about this on the Internet at the time. But I did manage to find out how to change the number of children Udev allows. You can do it while the system is running (which won't survive a reboot), or you can put in a kernel boot parameter to set it, until the Udev developers fix their code to provide a sane value for the maximum number of children allowed on systems with a large amount of memory.
At the command line, this is how. I used 32. You might like something different, of course.
udevadm control --children-max=32
And, as a permanent thang, the Linux kernel boot parameter is “udev.children-max=”.
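For what it's worth, here is a rough sketch of how that boot parameter could be added on a Debian box. It assumes GRUB 2 with its defaults in /etc/default/grub and a GRUB_CMDLINE_LINUX_DEFAULT line already present there; the function name add_children_max is just something I made up for illustration, and your boot loader setup may well differ.

```shell
#!/bin/sh
# Sketch only: append udev.children-max=<value> to the default kernel
# command line in a GRUB 2 defaults file. Assumes GNU sed (for -i) and an
# existing GRUB_CMDLINE_LINUX_DEFAULT="..." line.
# add_children_max is a hypothetical helper, not part of any package.
add_children_max() {
    grub_file="$1"   # e.g. /etc/default/grub
    value="$2"       # e.g. 32

    # Do nothing if the parameter is already there, so reruns are harmless.
    if grep -q 'udev\.children-max=' "$grub_file"; then
        echo "udev.children-max already set in $grub_file" >&2
        return 0
    fi

    # Keep whatever options are already inside the quotes, add ours at the end.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 udev.children-max=$value\"/" "$grub_file"
}

# Typical use, as root (back up the file first):
#   add_children_max /etc/default/grub 32
#   update-grub
# After the next reboot, "cat /proc/cmdline" should show the parameter.
```

Nothing changes until you run update-grub and reboot, so the udevadm command above is still the way to apply the limit to a system that is already up.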
Hopefully this will save some of you some of my headache.