On Tue, 3 Feb 2004, Doug White wrote:

> On Tue, 3 Feb 2004, Cy Schubert wrote:
>
> > > Why nothing ?
> >
> > Iostat doesn't see the I/Os because RAID rebuilds occur within the
> > controller, the I/Os are not initiated in the O/S nor any of its utilities,
> > therefore the FreeBSD kernel doesn't see them. The O/S doesn't see the
> > I/Os. Atacontrol sees 3% because it specifically queries the controller for
> > that information.
>
> ATARAID is purely OS driven. The OS issues the writes for the rebuild, as
> well as failure detection and mirroring. You're thinking of SCSI
> controllers, or 3ware controllers.
>
> Since the rebuild I/O is driven by the kernel, it bypasses the normal I/O
> path and thus doesn't register in the stats. If you try to do heavy I/O
> to the devices, you'll find the performance is reduced.

Drivers should register all interesting i/o transactions, but GEOM now
hides even more details from them than before, so iostat often shows bogus
stats. E.g., if you try to write 256K-blocks to an ad (non raid) disk,
then there are many layers of deblocking and enblocking, and iostat shows
a wrong layer:

- first, physio() knows that you don't really want the 256K-blocks that
  you asked for (this is a bug for some devices but not disks). It
  deblocks to block size dev->si_iosize_max. si_iosize_max is supposed
  to be device-specific, but it is now just bogus. GEOM always sets it
  to MAXPHYS (128K) for disks.

  si_iosize_max is bogus for other reasons. Disk devices need to support
  reading blocks of sizes up to (VM_INITIAL_PAGEIN * PAGE_SIZE) bytes
  (64K on i386 and 128K on alphas...) for execve() to work. The size for
  this on alphas is accidentally the same as MAXPHYS, so si_iosize_max
  must be MAXPHYS or larger for non-broken disk devices and there is no
  point in having it. This is mostly fixed in -current, but in RELENG_4
  most disk devices advertise a bogus limit of DFLTPHYS = 64K. They had
  better support MAXPHYS = 128K and deblock it internally to support
  alphas.
  The acd driver in RELENG_4 advertises a bogus limit of 32K or 126K but
  actually does 128K or possibly more without deblocking.

- second, GEOM registers the i/o's with devstat with the sizes that it
  gets from physio() (128K in this example).

- third, GEOM deblocks the 128K blocks to the maximum sizes advertised by
  the driver in the new d_maxsize struct member. The ad driver could
  handle 128K-blocks without deblocking in RELENG_4 (this is an old
  optimization by dyson, except the maximum was 127K or 127.5K in the
  first version of it because some drivers were claimed to not like
  128K). Now the ad driver can only handle 64K blocks, so GEOM turns the
  128K-blocks into 64K ones.

- fourth, there might be another layer of deblocking in the driver (there
  isn't one for ad AFAIK). iostat would not show it.

- fifth, there is more deblocking in the drive. Sectors are normally 512
  bytes, at least virtually, so there is a lot of deblocking to get them
  from a 64K-block. iostat just doesn't support this level.

The best block size to use and the best deblocking strategies are not
clear, but iostat should show the size sent to the hardware (or sizes at
all levels) so that the best sizes and deblocking strategies can be chosen.

Bruce

Received on Wed Feb 04 2004 - 01:44:19 UTC
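
The layers of deblocking listed in the message can be traced with a small simulation. This is only a sketch in Python, not kernel code: the helper names `deblock` and `trace` are hypothetical, the layer labels are illustrative, and the limits (a 256K request, 128K for si_iosize_max, 64K for d_maxsize, 512-byte sectors) are simply the figures from the message's example.

```python
KIB = 1024


def deblock(sizes, limit):
    """Split every request in `sizes` into chunks no larger than `limit` bytes."""
    out = []
    for size in sizes:
        while size > limit:
            out.append(limit)
            size -= limit
        if size:
            out.append(size)
    return out


def trace(request, layers):
    """Return [(layer_name, request_sizes)] after each layer's deblocking."""
    stages = []
    sizes = [request]
    for name, limit in layers:
        sizes = deblock(sizes, limit)
        stages.append((name, sizes))
    return stages


# Limits taken from the message's example; a hypothetical model of the path.
LAYERS = [
    ("physio (si_iosize_max)", 128 * KIB),  # the sizes devstat gets to see
    ("GEOM (d_maxsize)", 64 * KIB),         # what the ad driver is handed
    ("drive (sector)", 512),                # what the drive finally transfers
]

stages = trace(256 * KIB, LAYERS)
for name, sizes in stages:
    print(f"{name}: {len(sizes)} x {sizes[0]} bytes")
```

Under these assumptions the statistics would be recorded at the first stage (two 128K transfers), while the driver is handed four 64K blocks and the drive moves 512 sectors of 512 bytes each, which is the mismatch the message complains about.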