LVM Basics – A Quick Intro and Overview by Example

Storage technologies and methods can be a confusing subject when you’re first starting out, and depending on which route you choose and how you organize things, they can stay confusing even once you know a lot – unless you plan and organize well at the beginning.

Take the time to think about what you want to have before losing yourself in the details of any given technology. And keep this in mind as you create.

Following is a very generalized overview of some practical LVM use, presented in simple, no-frills terms. LVM tends to scare people away, seeming complex at first. And it can be, but it doesn’t have to be. This is the easy way, to get you started with LVM if you’re interested. You can make it harder (and perhaps better) later.

What is LVM? Why use it?

GNU/Linux gives you many options. LVM, the Logical Volume Manager, is one of them. LVM lets you combine disks, partitions, and even whole arrays of disks into single filesystems.

LVM is very flexible. None of your storage media or arrays need to match up – you can combine a high performance PCI-e RAID10 array with an external USB 3.0 4TB hard drive if you like. Not always the best idea performance-wise, but often performance isn’t the most important thing.

LVM performs very well, too. Is your RAID0 SATA array running out of space? Add 2 more drives in another RAID0 and combine their capacities together. Or just throw in one new drive. It’s a messy way to do it, but you can. And it would still be fast.

Or perhaps you’re more sober, and want to create a SAN for your network, starting off with a 4-disk RAID10 with smaller drives. But you’re worried the space might fill up, and you can’t have the files span multiple filesystems. If you use LVM, you can just put in a second array at a later date, and expand your logical volume seamlessly.

LVM Details

From a designer and administrator standpoint, you can think of LVM as having 3 main components:

  1. Physical volumes (devices, partitions, or arrays).
  2. Volume groups – physical volumes you bundle together into named pools.
  3. Logical volumes – the “disks” you carve out of those groups you made.

This is where your planning comes into play (or not if you’re bad).

I’ve gotten into the habit of always using LVM to create the volumes which I will then format into filesystems. This gives me the flexibility to add more capacity, or take some away, at any point in the future.

The Debian distribution has long supported creating and installing itself into LVM volumes. Most of the major distributions now seem to support this as well.

So, using LVM, it’s a simple matter to take your 6TB RAID array and make a nice little 10GB root drive for Debian and a 5TB home directory, which you could then share with another 10GB root volume you might make later for, say, Fedora.
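Jumping ahead a bit, the carving itself ends up looking something like this – the commands are explained further down, and the volume group name “big” here is just made up for the example:

lvcreate -n debian-root -L 10G big
lvcreate -n home -L 5T big
# and later, space permitting, a root for another distribution
lvcreate -n fedora-root -L 10G big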

Of course, you could partition your RAID array with extended partitions to achieve much of the same result, but you would not have nearly the flexibility later.

The Nitty Gritty

Working with LVM requires that you have the LVM tools. In Debian you can get these installed for you automatically if you choose to install onto an LVM volume, or you can get them later with

apt-get install lvm2

Instead of “normal” devices for hard drive volumes, you’ll get logical devices that are created by the device mapper. In Debian, at least, they live under /dev/mapper, and you also get a prettier version of the device names you create under /dev/<volume_group> — and you get to choose the <volume_group>.
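Once you’ve actually created some volumes (we’ll get there below), you can see those device nodes with something like this – the exact output will vary by system, of course:

# the device-mapper nodes themselves
ls -l /dev/mapper
# or a tree view of all block devices, LVM volumes included
lsblk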

Creating Physical Volumes

The first thing you need to do is decide which devices you’d like to take over as LVM-controlled beasties. You can also specify individual partitions of disks if you like. The point being, you don’t even have to partition a disk first – you can specify the whole device if you like. Or partition it if you prefer. People will always argue over what’s best, and there is no clear winner. Which means, do what you like!

Let’s say you have 2 SATA3 drives, /dev/sdb and /dev/sdc. And you want to use them for LVM. The first thing to do is claim those physical volumes for LVM:

pvcreate /dev/sdb /dev/sdc

And let’s suppose later you decide you also want to use your RAID 1 array (/dev/md0) for LVM as well. You can designate any physical volume to be used with LVM, at any time:

pvcreate /dev/md0

Oh, I also have a USB 3.0 drive I might want to do something with:

pvcreate /dev/usb_drive_0

Do be careful though – consider any data on these destroyed.
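If you want to double-check what LVM now thinks it owns, something like this will show you (exact output varies by version):

# a quick summary of all physical volumes
pvs
# or the full details for one of them
pvdisplay /dev/md0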

Creating Volume Groups

Once you have some physical volumes in your system designated for LVM use, you can start grouping them together. Creating volume groups doesn’t give you anything you can format into a filesystem; it just groups together whichever physical volumes you want to use into a named group that you can reference as one.

Think of it as the first abstraction layer — which of these physical devices do I want to group together and use as a big pool of space, from which I can make my littler hard drives.

You might want to consider how fast the drives perform when grouping. For example, you probably don’t want to group your super fast RAID array in with your USB drive. But maybe you’re fine grouping your RAID array in with your other SATA drives.

Yes, yes indeed, we’re fine with grouping our RAID in with our SATA drives. It will be fast. And the USB will be slow. How about we name these groups to reflect the fastness and slowness of each? Maybe I can use the slow one for backups?

vgcreate fast /dev/md0 /dev/sdb /dev/sdc

That’s it. Easy to create a volume group. Just pick a name, like “fast” as we did here, and then list the volumes you want to use in it — ones that you created with pvcreate. And if you’re wondering, yes you can skip the pvcreate part above — but don’t — until you’re very sure that all this structure stuff is going to stick in your head.

Now, even though we only have one USB drive, we may add more later nonetheless. So why not put it into an LVM group now – then later if we need to expand our backup area with more space, we can add a second USB drive and just add it to this same group for instant gratification.

We’ll call this one slow:

vgcreate slow /dev/usb_drive_0

So we now have two spaces we can work with – the “fast” space, and the “slow” space. These are the volume groups. And we no longer have to care about which devices make them up. Well, until something goes wrong.
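To see your groups at a glance, something along these lines does the trick:

# one line per volume group, with sizes and free space
vgs
# or every gory detail for every group
vgdisplay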

I suppose it’s worth noting that LVM provides no redundancy or backups, by default, although there are some great mechanisms built into it for doing so later, in some… interesting ways. So it’s really best to make sure your physical volumes have the level of redundancy and protection you want, before making those physical volumes into logical ones. I almost never make an LVM volume out of anything but a RAID array, unless I just don’t care that much about losing data, because I’m so awesome about backups (mm hmm).

Creating Logical Volumes

Now you can create logical volumes in your volume groups, and these logical volumes you can format just like hard drive partitions.

When you create your groups, LVM will allocate all the space possible on those devices, and make it available to you to create your logical volumes in. If you ever want to see how much space you have:

vgdisplay fast

You’ll get to see how much space you have, and also how much space has already been allocated to logical volumes.

Now, let’s say I want to have a 100GB “disk” for my database files. Fast disk access is important for database work, so we’ll put that on the “fast” volume group:

lvcreate -n database -L100G fast

This will create a logical device of 100GB size called /dev/fast/database (at least in Debian). A more pedantic device name is also created called /dev/mapper/fast-database but I prefer using the first naming convention.
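If you want to confirm what got created, something like this will show you (again, output varies a bit by version):

# summary of the logical volumes in the "fast" group
lvs fast
# or the full rundown for this one volume
lvdisplay /dev/fast/database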

You can then use this newly-created device just like you would any hard drive. You can partition it if you like, or you can just create a filesystem on it:

mkfs.ext4 /dev/fast/database

Now, if you decide you just must partition these, they can be a little tricky to mount. Using a tool called partx you can expose the partitions inside as devices you can work with, but we won’t go into that in any depth here.
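Just so it’s not a total mystery, a rough sketch of the partx approach might look like this – assuming you put a partition table on /dev/fast/database, and your mileage may vary with the resulting device names:

# tell the kernel about the partitions inside the logical volume
partx -a -v /dev/fast/database
# then see what partition devices appeared
lsblk /dev/fast/database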

But if you skip the partitioning bit, these logical volumes mount just like normal filesystems, except really, they can be coming off of any number of drives in your volume group.

mount -t ext4 /dev/fast/database /var/mydatabase

Easy as pie! (when you don’t have to make the crust)

The same holds true for your slow USB drive:

lvcreate -n backups -L 2T slow
mkfs.ext4 /dev/slow/backups
mount -t ext4 /dev/slow/backups /mnt/backups

Here we created a “drive” of 2TB called “backups”, formatted it, and mounted it at /mnt/backups. This would be our USB drive.

Adding More Drive Space to an LVM Logical Volume

Now that we’re demystified, and have been using our system just fine for weeks, we ended up filling up our /dev/slow/backups with backups. Let’s say our USB drive was 3TB. We only allocated 2TB to our logical volume for backups, so really we have 1TB left free in the “slow” volume group. We could see this with

vgdisplay slow

So, if we wanted, we could just increase the size of our logical volume /dev/slow/backups to eat up that extra 1TB that exists in that volume group, and so give us the full capacity of that USB drive:

lvextend -L+1T /dev/slow/backups

Here we give a relative size in the -L parameter, saying we want to add 1TB to the already-allocated size of 2TB, making a total of 3TB. Of course, this doesn’t make the filesystem bigger, just the “disk” the filesystem is on. But ext4 is easy to resize:

resize2fs /dev/slow/backups

By not specifying a specific size, resize2fs will just make the filesystem as large as it can on that “drive”. And you don’t have to take the disk offline either — this is an online resize, and I’ve never had it fail. But of course you should do it offline. But I never will.
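If you do want to be the careful sort, the offline version goes roughly like this (assuming nothing is still using the mount):

umount /mnt/backups
# offline resizing wants a forced filesystem check first
e2fsck -f /dev/slow/backups
resize2fs /dev/slow/backups
mount -t ext4 /dev/slow/backups /mnt/backups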

You can also shrink filesystems and volumes, but I’m not going to go into any of that here because shrinking filesystems is unnatural.

But suppose even 3TB is not enough for your backups. Well, it’s easy to add a new device to a volume group, and make it available for a logical volume to eat up in its gluttony.

You buy another USB drive! So now we need to add it to the “slow” volume group. You got a good deal on a 4TB drive this time. So you have your original 3TB one, and now a 4TB one you’re going to use to expand your storage. Let’s assume it’s assigned /dev/usb_drive_1.

First, create the physical volume, as usual:

pvcreate /dev/usb_drive_1

Then add it to your “slow” volume group:

vgextend slow /dev/usb_drive_1

After you do that, if you look with

vgdisplay slow

You’ll see that you now have a whopping 7TB volume group called “slow”.

Then you just do the same as above to extend your logical volume, or “drive”, for your backups:

lvextend -L+3T /dev/slow/backups
resize2fs /dev/slow/backups

Et voila! You now have a 6TB backup “drive”, because you didn’t use the full 4TB off the new one. You left 1TB free in the “slow” volume group, because you’re not always a complete hog, and you may like to use that 1TB for some other volume in the future, like:

lvcreate -n pron -L1T slow
mkfs.ext4 /dev/slow/pron
mount -t ext4 /dev/slow/pron /mnt/relax

Because, well, if not a hog, then perhaps a pig. Oink. And that uses up the last of the space.

Conclusion

You can, of course, remove devices from LVM groups as well. You just need to make sure you have enough space in the volume group to do so. For example, if you have 2 USB drives, one 3TB and one 4TB, making a total of 7TB — if you’re using up 6TB in logical volumes, you won’t be able to take away either of those USB drives, unless you can shrink your logical volumes down first (after shrinking the filesystems on them, unless you like agony).
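As a rough sketch of how that removal dance goes – say you want to pull the 3TB drive (/dev/usb_drive_0) out of the “slow” group, and the group has enough free space left to absorb its data:

# migrate everything off that drive onto the other physical volumes in the group
pvmove /dev/usb_drive_0
# remove it from the volume group
vgreduce slow /dev/usb_drive_0
# and, if you're done with LVM on it entirely, release it
pvremove /dev/usb_drive_0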

The point being, LVM is quite flexible.

One of my other favorite features is the ability to take a live snapshot. This is great for backing up whole disk images of a live-running system. You create an LVM “snapshot”, and then you can dump that image anywhere you like, and the filesystem will be in a consistent state, even with the system running. It’s wonderful for virtual machine images especially.
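Just to give you a taste, a minimal sketch of the idea might look like this – the snapshot size and the names here are made up, and the snapshot only needs enough room to hold whatever changes while it exists:

# freeze a point-in-time view of the database volume
lvcreate -s -n database_snap -L 10G /dev/fast/database
# copy the consistent image off somewhere safe
dd if=/dev/fast/database_snap of=/mnt/backups/database.img bs=4M
# and throw the snapshot away when you're done
lvremove /dev/fast/database_snap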

But I’m not going into that in any more depth here, just yet.

Hopefully this might have helped you a little. I remember when I was first looking at LVM it was a confusing mish-mash of all these different options, and nothing spelled it out simply. Hopefully, this has managed to do so, if you’re looking to get started with it, for play or production.

And as always, check out the man pages. There are lots of other places out there too with more detailed and specific use cases and feature examples.

Honestly, using LVM has changed everything for me. It’s certainly worth looking at. Best to you!

How To Deal With Udev Children Overwhelming Linux Servers With Large Memory

A few months ago a client purchased a new server and asked me to set them up with Linux-based virtual machines to consolidate their disparate hardware. It was a big server, with 128 Gigs of memory. I chose to use Debian Wheezy and QEMU KVM for the virtualization. We’d be running a couple Windows Server instances, and a few Debian GNU/Linux instances on it.

Unfortunately, we ended up encountering some strange problems during heavy disk IO. The disk subsystem is a very fast SAS array on board the server, which is a Dell system. The VMs’ disks are created as logical volumes on the array using LVM.

Each night, in addition to the normal backups they do within the Windows servers, they also wanted a snapshot of the disk volumes sent off to a backup server. No problem with LVM, even with the VMs running, when you use the LVM snapshot feature. This does tend to create a lot of disk IO, though.

What ended up happening was that occasionally, every week or two, the larger of the Windows server instances would grind slowly to a halt, and eventually lock up. The system logs on the real server would begin filling with timeouts from Udev – about it not being able to kill its children. This would, in turn, affect the whole system – making a reboot of the whole server necessary. Very, very ugly, and very, very embarrassing.

I tried a couple off-the-cuff fixes that were shots in the dark, hoping for an easy fix. But the problem didn’t go away. So I had to dig in and research the problem.

It turns out that the udev daemon (udevd) decides how many children it will allow based upon the amount of memory in the server. In our case, 128G – which is quite a lot. This number of allowed children was a simple one-to-one ratio based upon memory. However, with this much memory, that many children seemed to be overloading the threaded IO capacity of this monster server, causing blocking while live LVM snapshots were being copied.

What I ended up doing was manually specifying that the maximum number of allowed children for Udev would be 32 instead of the obscene number the inadequate calculation in the Udev code came up with. Since doing this, the server has run perfectly, without a hitch, for a good, long time.

So this is for anyone who may have run into a similar problem. I could find no information about this on the Internet at the time. But I did manage to find how to change the number of children Udev allows. You can do it while the system is running (which will un-happen once the server is rebooted), or you can put in a kernel boot parameter to set it, until the Udev developers fix their code to provide a sane value for the maximum number of children allowed on systems with a large amount of memory.

At the command line, this is how. I used 32. You might like something different, of course.

udevadm control --children-max=32

And, as a permanent thang, the Linux kernel boot parameter is “udev.children-max=”.
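On Debian with GRUB 2, one way to make that stick (hedged – adjust to your own setup) is to add the parameter to the kernel command line in /etc/default/grub and regenerate the config:

# in /etc/default/grub, append to the existing line, e.g.:
# GRUB_CMDLINE_LINUX_DEFAULT="quiet udev.children-max=32"
# then rebuild the GRUB configuration
update-grub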

Hopefully this will save some of you some of my headache.