Dmitry Morozovsky wrote:
> On Mon, 21 Apr 2008, Scott Long wrote:
>
> [snip]
>
> SL> > At least after simulating drive loss (atacontrol detach, atacontrol attach)
> SL> > I can't rebuild ar0:
> SL> > marck_at_moleskin:~#
> SL> > atacontrol status ar0
> SL> > ar0: ATA RAID1 status: DEGRADED
> SL> > subdisks:
> SL> > 0 ad16 ONLINE
> SL> > 1 ad18 ONLINE
> SL> > marck_at_moleskin:~# atacontrol rebuild ar0
> SL> > atacontrol: ioctl(IOCATARAIDREBUILD): Input/output error
> SL> > marck_at_moleskin:~#
> SL> >
> SL> > Or, should I wipe out ar label from the second disk to emulate disk
> SL> > replacement?
> SL> >
> SL>
> SL> Generating metadata is not supported, nor is automatic failover to a
> SL> spare. If you're interested in working on the code, let me know and
> SL> I'll help you get started.
>
> Yes I'm interested; however, my kernel hacking skills are rather limited and,
> as I think, rudimentary ;-)
>
> But I at least would try.

DDF requires knowledge of the topology of the entire system, something
that neither ata-raid nor g_raid can provide or even has a concept of.
So generating metadata from scratch is massively error-prone in all but
the simplest case of a single array and a single set of disks on a
single controller. Also, the spare handling currently in ata-raid is
limited to array-dedicated spares, and it assumes that a spare will
always be at a fixed position that can be directly mapped into the
array via an array in C (note the overloaded use of the term 'array' in
this discussion; I'll try to keep it clear when I'm talking about the C
programming construct vs the collection of disks construct).

These are two separate problems, so I'll describe them separately.

For creating/writing metadata, the first thing that I'd do is to save
off the existing metadata from any good disks into a buffer pointed to
by one of the magic fields in the ar_softc. These fields are actually
uint64_t types, but they can be overloaded to hold a pointer.
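
To make the pointer overloading concrete, here is a minimal userland
sketch, not code from the patch. Everything in it is an assumption made
for illustration: struct fake_softc stands in for the real ar_softc,
magic_0 is a guessed field name, META_SIZE is an arbitrary size rather
than the real DDF layout, and malloc(3)/free(3) stand in for whatever
kernel allocator the driver would actually use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ar_softc; only the field used here. */
struct fake_softc {
    uint64_t magic_0;    /* assumed name for one of the "magic" fields */
};

#define META_SIZE 512    /* illustrative size, not the real DDF layout */

/* Copy metadata read from a good disk and park the copy in the softc. */
static int
meta_save(struct fake_softc *sc, const void *ondisk, size_t len)
{
    void *copy;

    /* The overload only works if a pointer fits in the field. */
    assert(sizeof(void *) <= sizeof(uint64_t));

    copy = malloc(len);
    if (copy == NULL)
        return (-1);
    memcpy(copy, ondisk, len);

    /* Go through uintptr_t so the integer/pointer conversion is clean. */
    sc->magic_0 = (uint64_t)(uintptr_t)copy;
    return (0);
}

/* Recover the saved buffer so it can be updated and written back out. */
static void *
meta_get(const struct fake_softc *sc)
{
    return ((void *)(uintptr_t)sc->magic_0);
}

/* Release the saved copy and clear the field. */
static void
meta_free(struct fake_softc *sc)
{
    free(meta_get(sc));
    sc->magic_0 = 0;
}
```

Casting through uintptr_t keeps the conversion well-defined on both
32-bit and 64-bit platforms, since a uint64_t is at least as wide as a
pointer on either.
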
Then, when it's time to write out metadata, the saved buffer can be
updated and written out, preserving the previously-recorded system-wide
information that ata-raid can't provide. Creating metadata from scratch
is significantly more cumbersome but certainly not hard, aside from the
fact that the VDR and PDR records are going to be incomplete.

For spares, the existing ata-raid code treats them like unactivated
members of the array. They are assigned a static slot in the C array
that is used to order the I/O, and are simply skipped over if they
aren't activated. This is completely unworkable in anything but the
RAID-1 case, but luckily that's the only kind of redundancy that
ata-raid supports. However, it also doesn't lend itself well to
supporting global spares, something that DDF does support. What I'd do
is to add a linked list to the ar_softc that spare disks can be placed
on while they are inactive. When a spare is activated, I'd take it off
the list and put it into the slot in the C array that belonged to the
failed/missing member. I'd also look into adding a global spare linked
list.

These changes are pretty simple; the rest of the work involves just
completing the unfinished spare code in the patch that I posted
(there's a comment in there that points to where the missing code is).
The spare activation code would probably need some work as well to
inform the DDF module of the changing role of the spare and its new
position within the array.

The DDF spec can be found at www.snia.org; it's an open spec. Adaptec
uses an older unpublished revision of the spec that has some
unfortunate differences. I've compensated for some of those differences
in the patch, but there might be others that I haven't encountered yet.

Let me know if you have any questions.

Scott

Received on Mon Apr 21 2008 - 14:36:29 UTC
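
The spare-list idea described in the message can be sketched in plain
userland C as follows. This is only an illustration under stated
assumptions, not the ata-raid code: struct disk, struct spare_list,
struct array_softc, MAX_DISKS and all three function names are
hypothetical, and a hand-rolled singly linked list is used where a
kernel implementation would more likely use the queue(3) macros.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DISKS 4              /* illustrative; not the real ata-raid limit */

struct disk {
    const char  *name;
    int          online;
    struct disk *next;           /* link used only while parked as a spare */
};

struct spare_list {
    struct disk *head;
};

/* Hypothetical stand-in for struct ar_softc. */
struct array_softc {
    struct disk      *slots[MAX_DISKS];  /* the C array that orders I/O */
    int               nslots;
    struct spare_list dedicated;         /* inactive spares owned by this array */
};

/* Park an inactive spare on a list. */
static void
spare_add(struct spare_list *l, struct disk *d)
{
    d->next = l->head;
    l->head = d;
}

/* Take one spare off a list, or return NULL if the list is empty. */
static struct disk *
spare_take(struct spare_list *l)
{
    struct disk *d = l->head;

    if (d != NULL) {
        l->head = d->next;
        d->next = NULL;
    }
    return (d);
}

/*
 * Fill the first failed/missing slot with a spare, trying the array's
 * dedicated list before the global one. Returns the slot index used, or
 * -1 if every member is healthy or no spare is available.
 */
static int
spare_activate(struct array_softc *sc, struct spare_list *global)
{
    struct disk *d;
    int i;

    assert(sc->nslots <= MAX_DISKS);
    for (i = 0; i < sc->nslots; i++) {
        if (sc->slots[i] != NULL && sc->slots[i]->online)
            continue;
        d = spare_take(&sc->dedicated);
        if (d == NULL && global != NULL)
            d = spare_take(global);
        if (d == NULL)
            return (-1);
        /* Simplification: a real driver would start a rebuild and
         * update the on-disk metadata before calling the disk online. */
        d->online = 1;
        sc->slots[i] = d;
        return (i);
    }
    return (-1);
}
```

Keeping inactive spares off the I/O-ordering array means the array only
ever holds real members, and the same spare_take() path serves both the
dedicated and the global list.
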