> I must admit that I have often been tempted to move the queue+sorting
> out of the drivers because they all, more or less, do the exact
> same thing.
>
> For one thing, that would simplify any ABI for changing disksort
> algorithm (which should be per drive and not per system).

Agreed. For example, IBM's AIX implements per-device disksort algorithms.
If you've got ancient SSA drives, they will sort requests differently than
disks in an ESS or FAStT disk array.

> Finally, I am still pretty convinced that if somebody sat down and
> did some real-life measurements, they would find that disk-sorting
> has a different task these days where the drives have much more and
> much more detailed knowledge about the physics of the situation.
>
> Over the years I have read quite a bit of IBM's mainframe docs and
> research on this topic, and they have found a lot of interesting
> things which all more or less are present in todays zSeries.
>
> Much of the work in recent years have tended to move the other
> direction, instead of sorting the work before shipping to the disk,
> disk state is exported so it can shape the workload. For instance
> average I/O time estimates are now used to affect block allocation
> in DB2 databases.

This is only true if you're talking about DB2 on zSeries.
[ Disclaimer: I'm a DB2 LUW developer. ]

DB2 on LUW (Linux/Unix/Windows) does exactly the opposite in most cases.
I/Os are generally sent to the OS in unsorted fashion (unless it's a big
batch of sequential pages, in which case they're sent in sorted order),
and we let the OS handle things appropriately. Most OSes are relatively
intelligent and share information throughout the disk/LVM subsystem with
respect to device queue properties and I/O request lists, and can make the
best decisions on how to sort.
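As a rough illustration of the per-device idea discussed above, here is a minimal sketch in Python. Every name in it (`Device`, `fifo`, `one_way_elevator`) is made up for the example; none of it is taken from FreeBSD, AIX, or DB2, and a request is simplified to a bare block number. The only point is that the sort policy is attached to the device rather than fixed system-wide.

```python
# Hypothetical sketch: each device carries its own sort policy.
# None of these names come from FreeBSD, AIX, or DB2.
from dataclasses import dataclass, field
from typing import Callable, List

Request = int  # assumption: a request is reduced to its starting block number


def fifo(queue: List[Request], head: int) -> List[Request]:
    """No sorting: leave the ordering to the drive or array controller."""
    return list(queue)


def one_way_elevator(queue: List[Request], head: int) -> List[Request]:
    """Serve blocks at or beyond the head in ascending order, then wrap."""
    ahead = sorted(b for b in queue if b >= head)
    behind = sorted(b for b in queue if b < head)
    return ahead + behind


@dataclass
class Device:
    name: str
    sort: Callable[[List[Request], int], List[Request]]  # per-device policy
    head: int = 0
    queue: List[Request] = field(default_factory=list)

    def dispatch_order(self) -> List[Request]:
        # The device, not the system, decides how its queue is ordered.
        return self.sort(self.queue, self.head)


old_disk = Device("old-disk", one_way_elevator, head=50, queue=[10, 90, 60, 20])
array_lun = Device("array-lun", fifo, head=50, queue=[10, 90, 60, 20])
```

With the same four requests queued, `old_disk.dispatch_order()` reorders them around the head position, while `array_lun.dispatch_order()` passes them through untouched and leaves any reordering to the array.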

> The one place where disk-sorting _really_ makes a huge impact
> is RAID5 but very specialized sorting algorithms are necessary
> there and they need intimate access to the internals of the
> RAID5 engine.

Yet another reason to defer the sorting algorithm to the driver or the
hardware itself.

--
Matt Emmerton

Received on Tue Jul 12 2005 - 23:15:18 UTC