I've tracked down the source of the memory corruption in -current that results when booting with various CD and DVD drives (especially the ones that come with Thinkpads including T23, R32, T41, etc.) The panic is obvious when running with INVARIANTS ("memory modified after free") but not so obvious in other configurations. For instance, without INVARIANTS, part of the rt_info structure is corrupted on my wireless card, resulting in a panic during ifconfig on boot. This is likely the source of other problems, including phk's ACPI panic (again, only triggered when booting with the CD drive in the bay.) The root problem is that ata_timeout() fires and calls ata_pio_read() which overwrites 512 bytes random memory. There are actually two bugs here that overwrite memory. The code path is as follows: 1. ata runs an IDENTIFY command on each drive. It reaches this stack: ata_getparam() ata_identify_devices() ata_boot_attach() 2. ata_getparam() allocates a request and runs it: ata_alloc_request() loop on retries (2 max) fill out an immediate read request for 512 bytes (DEV_BSIZE) *** Bug 1: transfersize is 512 bytes but sizeof(struct ata_request) is much less (~80 bytes). ata_queue_request() starts the request and arms a timeout 3. Completion for first request results in FAILURE - no interrupt. ata_completed() calls ata_generic_interrupt() which calls ata_pio_read() *** Bug 1: here's where the 512 - sizeof(struct ata_request) bytes following "request" are overwritten. . The completion code also updates request->donecount (an upward-counting residual) from 0 to 512 after the request has been completed. 4. ata_getparam runs through the loop again and starts another identify *** Bug 2: donecount is now 512 so ata_pio_read() will write on request->data + 512 with 512 bytes. Whatever follows request->data + 512 is overwritten. I've tested the attached patch and it fixes the memory overwrite (tested with hardware watchpoints) and obviously, the panics. It resets request->donecount in each loop iteration (bug 2). It also mallocs DEV_BSIZE bytes for the transfer (bug 1). An alternate fix for bug 1 is to lower the transfer size to sizeof(struct ata_request) but there may be some minimum transfer length requirements that force you to use DEV_BSIZE. Commit approval requested. -Nate Index: ata-all.c =================================================================== RCS file: /home/ncvs/src/sys/dev/ata/ata-all.c,v retrieving revision 1.215 diff -u -r1.215 ata-all.c --- src/sys/dev/ata/ata-all.c 12 Jul 2004 10:50:49 -0000 1.215 +++ src/sys/dev/ata/ata-all.c 30 Jul 2004 22:30:55 -0000 _at__at_ -549,7 +549,7 _at__at_ if (!atadev->param) atadev->param = malloc(sizeof(struct ata_params), M_ATA, M_NOWAIT); if (atadev->param) { - request = ata_alloc_request(); + request = malloc(DEV_BSIZE, M_ATA, M_NOWAIT | M_ZERO); if (request) { int retries = 2; while (retries-- > 0) { _at__at_ -561,11 +561,12 _at__at_ request->data = (caddr_t)atadev->param; request->bytecount = sizeof(struct ata_params); request->transfersize = DEV_BSIZE; + request->donecount = 0; ata_queue_request(request); if (!(error = request->result)) break; } - ata_free_request(request); + free(request, M_ATA); } if (!error && (isprint(atadev->param->model[0]) || isprint(atadev->param->model[1]))) {Received on Fri Jul 30 2004 - 20:49:27 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:04 UTC