Re: Logical volume management

From: Eric Anderson <anderson_at_centtech.com>
Date: Fri, 18 Nov 2005 06:39:09 -0600
Brian Candler wrote:
> Vinum's manpage makes my head spin. I was wondering if anyone had considered
> implementing something a bit more straightforward and also more dynamic.
> 
> Suppose you:
> 
> (1) Divide all your disks up-front into equal sized chunks, say 4MB.
> 
> (2) Use an indirection table to map each volume into an arbitrary set of
>     these chunks across all available disks.
> 
> (3) Store the indirection table at the end of a partition, as other GEOM
>     modules do for their metadata, and cache it in RAM.
> 
> (e.g. a 160GB drive, divided into 4MB blocks, each of which has a 32-bit
> indirection table entry, would require only 160KB of indirection table)
> 
> Why do this?
> 
> - You can install a system with minimal /, /usr, /var and /home, and then
> grow each one in small increments as needed just by adding spare chunks.
> With vinum you would end up with an increasingly complex configuration with
> more and more subdisks, since each subdisk must be a contiguous range of a
> physical drive. If you decide to get rid of a volume, then you need to keep
> track of those subdisk fragments. I'm not sure if it's possible to take an
> unused subdisk and split it so you can assign part of the free space to
> another volume. Even if you can, this still means more subdisk
> fragmentation.
> 
> With the above scheme an unused volume just returns its chunks into the pool
> for reallocation.
> 
> - You can identify 'hot' chunks and move them between disks. This is a lot
> more flexible than fixed striping. Unlike striping, it could distribute load
> between unevenly matched devices (e.g. 10GB on one disk and 20GB on
> another). It could also migrate 'hot' data to faster devices, such as a
> battery-backed RAM disk[*]. With the right tools, this could all happen
> automatically.
> 
> - Mapping volumes in fixed chunks in this way lends itself well to
> visualisation, e.g. all chunks belonging to the same volume can be shown as
> blocks in the same colour.
> 
> - What I'm suggesting may or may not look like Linux's LVM; I've never used
> that. If its data structure is suitable, we can just use that and gain some
> compatibility for multi-boot systems.
> 
> I guess you could work this way in vinum, dividing all your storage up front
> into 4MB subdisks, but it doesn't sound like fun to me.
> 
> I also guess there's a lot of devil-in-the-details to do with marking a
> volume as 'up' or 'down'; but hopefully mirroring and RAID could be
> delegated to other GEOM modules, leaving us just with logical
> {volume,extent} to physical {drive,extent} mapping to do.
> 
> Has something like this been proposed, discussed and/or discarded before?

I've been sketching out nearly the exact same thing over the past few 
weeks!  My goal was to come up with a way to use block devices very 
flexibly: growing volumes, adding more block storage to a pool, and 
so on, as you've mentioned above.
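
To make the idea concrete, here is a minimal sketch of a chunk pool 
with a per-volume indirection table.  It is Python for brevity, not 
GEOM code, and the names (ChunkPool, grow, destroy, resolve) are made 
up for illustration.  The 4MB chunk size and 32-bit table entries come 
from the quoted example; everything else is an assumption.

```python
CHUNK_SIZE = 4 * 1024**2  # 4MB chunks, as in the quoted proposal


def table_bytes(disk_bytes, entry_bytes=4):
    """Size of an indirection table with one entry per chunk."""
    return (disk_bytes // CHUNK_SIZE) * entry_bytes


class ChunkPool:
    """Hypothetical pool of free (disk, chunk_index) pairs."""

    def __init__(self):
        self.free = []      # unallocated (disk, chunk_index) pairs
        self.volumes = {}   # volume name -> ordered list of chunks

    def add_disk(self, name, disk_bytes):
        # Carve the whole disk into fixed-size chunks up front.
        self.free.extend((name, i) for i in range(disk_bytes // CHUNK_SIZE))

    def grow(self, volume, nchunks):
        # Growing a volume just appends chunks from the pool.
        if nchunks > len(self.free):
            raise ValueError("not enough free chunks in the pool")
        taken, self.free = self.free[:nchunks], self.free[nchunks:]
        self.volumes.setdefault(volume, []).extend(taken)

    def destroy(self, volume):
        # An unused volume returns its chunks to the pool.
        self.free.extend(self.volumes.pop(volume))

    def resolve(self, volume, offset):
        """Map a byte offset in a volume to (disk, byte offset on disk)."""
        disk, chunk = self.volumes[volume][offset // CHUNK_SIZE]
        return disk, chunk * CHUNK_SIZE + offset % CHUNK_SIZE
```

With binary units, table_bytes(160 * 1024**3) gives 163840 bytes, which 
matches the 160KB figure in the quoted text.  A volume grown across two 
disks resolves offsets past the first disk's chunks onto the second 
disk without the volume needing to know where its chunks live.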

One of the issues I was hoping to solve is the "can't grow a stripe 
onto more disks" kind of thing.  I started coming up with a feature set 
for a new volume manager, using GEOM as the base.  Some of the features were:

- ability to grow volumes online
- volume migration (online)
- volume snapshots (online)
- block pooling (to allow adding more blocks from a disk to the pool)
- auto block allocation (assigning blocks from the pool to a volume as 
needed)
- auto block promotion (moving most frequently used blocks to faster 
block storage devices, and/or auto mirroring those blocks on many 
devices for increased speed)
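
As a rough illustration of the auto block promotion item, the sketch 
below counts accesses per block and plans moves of the hottest blocks 
onto a faster device.  This is only one possible policy, written in 
Python for brevity; HotChunkTracker, plan_promotion and the device 
names are hypothetical, and it ignores demotion of blocks that have 
cooled off.

```python
from collections import Counter


class HotChunkTracker:
    """Hypothetical per-chunk access counter."""

    def __init__(self):
        self.hits = Counter()

    def record(self, chunk):
        # Called once per I/O that touches the chunk.
        self.hits[chunk] += 1

    def hottest(self, n):
        # The n most frequently used chunks, hottest first.
        return [chunk for chunk, _ in self.hits.most_common(n)]


def plan_promotion(tracker, location, fast_device, fast_capacity):
    """Return (chunk, src, dst) moves so that the hottest chunks end up
    on fast_device.  `location` maps chunk -> current device and
    `fast_capacity` is how many chunks the fast device can hold."""
    moves = []
    for chunk in tracker.hottest(fast_capacity):
        src = location[chunk]
        if src != fast_device:
            moves.append((chunk, src, fast_device))
    return moves
```

For example, if chunk 3 on da0 is hit five times, chunk 7 on md0 three 
times, and chunk 1 on da0 once, a fast device with room for two chunks 
yields a single planned move: chunk 3 from da0 to md0.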

It would be nice to be able to create an arbitrarily large volume that 
only consumes these volume blocks (you call them chunks) as the volume 
gets filled.  This way, you could create a 2TB volume with only a 
single 200GB drive; then, as you neared the 200GB used mark, you could 
add another disk and grow onto it, or even add 5 disks, and it could 
stripe the data across them, or mirror, etc.  You could also migrate 
volume blocks from one device to others, or have the volume manager 
automatically move the MFU (most frequently used) blocks to multiple 
volume block providers for striping+mirroring to gain extra performance.
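
A minimal sketch of that allocate-as-you-fill behaviour, again in 
Python and purely illustrative: ThinVolume and its methods are 
hypothetical names, the 4MB block size is borrowed from the quoted 
proposal, and a real implementation would also have to persist the 
table and handle reads of unallocated blocks.

```python
CHUNK_SIZE = 4 * 1024**2  # 4MB volume blocks (assumed, from the quoted text)


class ThinVolume:
    """Volume whose advertised size can exceed its backing storage."""

    def __init__(self, virtual_bytes, free_chunks):
        self.nchunks = virtual_bytes // CHUNK_SIZE
        self.free = list(free_chunks)  # physical chunks not yet in use
        self.table = {}                # logical chunk index -> physical chunk

    def add_chunks(self, chunks):
        # Adding a disk later just feeds more physical chunks to the volume.
        self.free.extend(chunks)

    def write(self, offset):
        """Return the physical chunk backing `offset`, allocating one
        on the first write to that part of the volume."""
        index = offset // CHUNK_SIZE
        if not 0 <= index < self.nchunks:
            raise IndexError("offset beyond end of volume")
        if index not in self.table:
            if not self.free:
                raise RuntimeError("pool exhausted; add another disk")
            self.table[index] = self.free.pop(0)
        return self.table[index]

    def used_bytes(self):
        return len(self.table) * CHUNK_SIZE
```

A 2TB ThinVolume backed by only a handful of chunks accepts writes 
anywhere in its address space until the free list runs dry; at that 
point add_chunks() with chunks from a new disk lets it continue.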

Maybe we should take this to freebsd-geom_at_?

Eric





-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------
Received on Fri Nov 18 2005 - 11:39:33 UTC
