Lars Eggert wrote:
> On 4/30/2003 4:28 PM, Terry Lambert wrote:
> > If you are panic'ing, and it's repeatable, then you should
> > minimally post:
>
> Done already:
>
> Message-ID: <3EAC5950.7040306_at_isi.edu>
> Date: Sun, 27 Apr 2003 15:27:28 -0700
> From: Lars Eggert <larse_at_ISI.EDU>
> Subject: Re: Kernel panic during portupdrade [ffs_blkfree:
> freeing free block]
>
> (Panic message was in an earlier post to the same thread.)

FWIW, Message-ID does me no good; it's not a searchable field for me.
If you are going to give me anything other than a URL for the message
in the mailing list archive, you probably want to give me (in order
of importance):

1) The mailing list it was sent to
2) The date
3) The sender
4) The subject

--

That's yours, not Kent's.

It's pretty obvious from looking at your message and the code what's
happening there: you are trying to free a frag of a block whose bit
is not set in the cylinder group bitmap.

To fix it, you have to ask yourself how it's even possible to get
into that situation in the first place.

Theoretically, this is not permitted to happen, because the CG bitmap
is supposed to be written out last.

Practically, there are several ways to cause this in -current; any
one of them could be your culprit (e.g. you are running with the
sched_sync() patches for fsync that were posted, or you crashed and
used a BG fsck instead of a full fsck, and trusted it to do the right
thing, etc.).

Let's assume that none of those are true at this point, and that you
can repeat the problem after doing a full fsck on the FS in question
from single user mode, and rebooting.

So...

The first question we need to answer is why sched_sync is your
callout in fork_exit(); seems pretty daft to me. I would think this
was indicative of stack corruption... or, it's indicative of
something being allowed to run that shouldn't run while a cleanup is
in progress, but not yet committed to the soft updates list (meaning
the CG bit should have been set, but wasn't).
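To make the failing check concrete, here is a minimal sketch in C of
what a "freeing free block" test amounts to. It is a toy model, not the
ffs_blkfree() code: the names toy_cg, toy_alloc, toy_free and TOY_NFRAGS
are all hypothetical, and the convention that a set bit means "in use"
is an assumption chosen to match the description above. The real FFS
cylinder group maps have their own layout and conventions, which this
sketch does not try to reproduce.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFRAGS 64	/* hypothetical: frags tracked by this toy map */

/* Toy stand-in for a cylinder group map; set bit = frag in use. */
struct toy_cg {
	unsigned char inuse[TOY_NFRAGS / CHAR_BIT];
};

/* Report whether the bit for this frag is set in the toy map. */
static bool
toy_isset(const struct toy_cg *cg, size_t frag)
{
	return ((cg->inuse[frag / CHAR_BIT] >> (frag % CHAR_BIT)) & 1u);
}

/* Mark a frag as in use. */
static void
toy_alloc(struct toy_cg *cg, size_t frag)
{
	cg->inuse[frag / CHAR_BIT] |=
	    (unsigned char)(1u << (frag % CHAR_BIT));
}

/*
 * Free a frag.  Returns 0 on success, or -1 in the case where a kernel
 * would panic: the frag is out of range, or its bit is not set, so the
 * caller is freeing something the map says was never allocated.
 */
static int
toy_free(struct toy_cg *cg, size_t frag)
{
	if (frag >= TOY_NFRAGS || !toy_isset(cg, frag))
		return (-1);
	cg->inuse[frag / CHAR_BIT] &=
	    (unsigned char)~(1u << (frag % CHAR_BIT));
	return (0);
}
```

Freeing the same frag twice, or a frag that was never marked in use,
takes the -1 path, which is the situation the panic reports: the map
and the caller disagree about who owns the frag.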

Permit me to suspect 1.193 and 1.192 of /sys/kern/kern_fork.c, and
1.442 and 1.443 of /sys/kern/vfs_subr.c; particularly, the conversion
from tsleep() to msleep().

A possible workaround might be to modify fork_exit(); there's code in
the function that reads:

        if (PCPU_GET(switchtime.sec) == 0)
                binuptime(PCPU_PTR(switchtime));
        PCPU_SET(switchticks, ticks);
        mtx_unlock_spin(&sched_lock);

        /*
         * cpu_set_fork_handler intercepts this function call to
         * have this call a non-return function to stay in kernel mode.
         * initproc has its own fork handler, but it does return.
         */
        KASSERT(callout != NULL, ("NULL callout in fork_exit"));
        callout(arg, frame);

Change it to read:

        if (PCPU_GET(switchtime.sec) == 0)
                binuptime(PCPU_PTR(switchtime));
        PCPU_SET(switchticks, ticks);

        /*
         * cpu_set_fork_handler intercepts this function call to
         * have this call a non-return function to stay in kernel mode.
         * initproc has its own fork handler, but it does return.
         */
        KASSERT(callout != NULL, ("NULL callout in fork_exit"));
        callout(arg, frame);
        mtx_unlock_spin(&sched_lock);

Instead.

Let me know what happens; it will probably complain about an LOR or a
lock being held that's "not supposed to be held, because otherwise the
kernel wouldn't panic" or whatever...

-- Terry

Received on Wed Apr 30 2003 - 17:18:23 UTC