Re: Panic during shutdown (cause identified)

From: Kostik Belousov <kostikbel_at_gmail.com>
Date: Thu, 2 Jul 2009 22:44:44 +0300
On Thu, Jul 02, 2009 at 11:37:01AM -0500, Greg Rivers wrote:
> On Thu, 2 Jul 2009, Kostik Belousov wrote:
> 
> >>>Also, please describe the load on the machine,
> >>>
> >>
> >>It happens regardless of the load.  For example, just booting 
> >>multi-user and immediately running shutdown (either by logging in or 
> >>pressing the ACPI power button) triggers the panic.
> >No, it does not happen regardless of the load. The patch was tested on 
> >the semi-standard set of programs run on the system, and all seen 
> >accounting mistakes were fixed.
> >
> 
> I don't know what patch you're referring to.  Are you saying this issue 
> was seen before and thought to have been fixed?
The issue is an indicator of a bug somewhere else, in the code that
does precise swap accounting. I committed that code approximately one
week ago. The patch I am referring to is r194766.
> 
> 
> >You have some process that exhibits the behaviour causing the error in 
> >swap accounting. For a start, you could just show me ps auxww 
> >output, in private if you prefer.
> >
> 
> I can save you the trouble of reading ps output.  Based on your insight 
> that the problem is with a particular process, I eliminated variables from 
> /etc/rc.conf by trial, and have determined that it's the amd(8) 
> automounter that's causing the panic.  When I remove 'amd_enable="YES"', 
> no more panic.
> 
> 
> >> The panic message on the console does not show the process.  Can that 
> >> be determined from kgdb?  If so, how?
> >It does show the process, like
> >KDB: enter: panic
> >[thread pid 32021 tid 100598 ]
> >
> 
> Yes, ordinarily such a message is shown, but it is not shown for this panic. 
> Also with this panic, about half the time the machine locks up hard just 
> before, during, or after the core dump.
> 
> 
> >BTW, did you see kernel messages like negative vmsize for uid = XXX ?
> >
> 
> No, there have been none of those.
> 
> 
> Please let me know if I can help with further testing/debugging. BTW, I 
> did not customize the amd configuration; I was using the stock 
> configuration from the base system.

The information you provided about amd(8) causing the problem was crucial.
The issue is that amd locks its pages with mlockall(2), and the code
neglected to account for the wired mappings, but did not forget to decrease
their swap share on unmapping.
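
To make the imbalance concrete, here is a minimal toy model in C. It is
not kernel code: struct swap_ledger, copy_entry(), unmap_entry() and
fork_then_exit() are hypothetical names invented for illustration, and
the real accounting in sys/vm is more involved. The sketch only assumes
what is described above: a wired entry copied at fork was not charged,
while its unmapping still released the charge.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy ledger: bytes of swap currently charged to one user. */
struct swap_ledger {
	long reserved;
};

/*
 * Model copying a map entry of `size` bytes at fork time.
 * charge_wired == false models the behaviour before the fix, where
 * wired entries were copied without being charged.
 */
static void
copy_entry(struct swap_ledger *l, long size, bool wired, bool charge_wired)
{
	assert(size >= 0);
	if (!wired || charge_wired)
		l->reserved += size;
}

/* Model unmapping: the charge is always released, wired or not. */
static void
unmap_entry(struct swap_ledger *l, long size)
{
	assert(size >= 0);
	l->reserved -= size;
}

/*
 * A child inherits one wired entry and then exits.
 * Returns the ledger balance afterwards; anything but 0 is a leak
 * or an underflow in the accounting.
 */
static long
fork_then_exit(long size, bool charge_wired)
{
	struct swap_ledger l = { 0 };

	copy_entry(&l, size, true, charge_wired);
	unmap_entry(&l, size);
	return (l.reserved);
}
```

In this model, fork_then_exit(4096, false) ends with a negative balance,
which is the kind of mismatch the accounting checks trip over, while
fork_then_exit(4096, true) ends balanced at zero.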

The patch below fixed the issue for me.

diff --git a/sys/vm/vm_extern.h b/sys/vm/vm_extern.h
index 7bacde4..ec21a3a 100644
--- a/sys/vm/vm_extern.h
+++ b/sys/vm/vm_extern.h
@@ -55,7 +55,8 @@ vm_map_t kmem_suballoc(vm_map_t, vm_offset_t *, vm_offset_t *, vm_size_t,
 void swapout_procs(int);
 int useracc(void *, int, int);
 int vm_fault(vm_map_t, vm_offset_t, vm_prot_t, int);
-void vm_fault_copy_entry(vm_map_t, vm_map_t, vm_map_entry_t, vm_map_entry_t);
+void vm_fault_copy_entry(vm_map_t, vm_map_t, vm_map_entry_t, vm_map_entry_t,
+    vm_ooffset_t *);
 void vm_fault_unwire(vm_map_t, vm_offset_t, vm_offset_t, boolean_t);
 int vm_fault_wire(vm_map_t, vm_offset_t, vm_offset_t, boolean_t, boolean_t);
 int vm_forkproc(struct thread *, struct proc *, struct thread *, struct vmspace *, int);
diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
index 43743f4..579cf49 100644
--- a/sys/vm/vm_fault.c
+++ b/sys/vm/vm_fault.c
@@ -1126,11 +1126,9 @@ vm_fault_unwire(vm_map_t map, vm_offset_t start, vm_offset_t end,
  *		entry corresponding to a main map entry that is wired down).
  */
 void
-vm_fault_copy_entry(dst_map, src_map, dst_entry, src_entry)
-	vm_map_t dst_map;
-	vm_map_t src_map;
-	vm_map_entry_t dst_entry;
-	vm_map_entry_t src_entry;
+vm_fault_copy_entry(vm_map_t dst_map, vm_map_t src_map,
+    vm_map_entry_t dst_entry, vm_map_entry_t src_entry,
+    vm_ooffset_t *fork_charge)
 {
 	vm_object_t backing_object, dst_object, object;
 	vm_object_t src_object;
@@ -1163,10 +1161,12 @@ vm_fault_copy_entry(dst_map, src_map, dst_entry, src_entry)
 	VM_OBJECT_LOCK(dst_object);
 	dst_entry->object.vm_object = dst_object;
 	dst_entry->offset = 0;
-	if (dst_entry->uip != NULL) {
-		dst_object->uip = dst_entry->uip;
+	if (fork_charge != NULL) {
+		dst_object->uip = curthread->td_ucred->cr_ruidinfo;
+		uihold(dst_object->uip);
 		dst_object->charge = dst_entry->end - dst_entry->start;
 		dst_entry->uip = NULL;
+		*fork_charge += dst_entry->end - dst_entry->start;
 	}
 	prot = dst_entry->max_protection;
 
diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c
index 82d37e6..ea6f713 100644
--- a/sys/vm/vm_map.c
+++ b/sys/vm/vm_map.c
@@ -2909,7 +2909,8 @@ vm_map_copy_entry(
 		 * Cause wired pages to be copied into the new map by
 		 * simulating faults (the new pages are pageable)
 		 */
-		vm_fault_copy_entry(dst_map, src_map, dst_entry, src_entry);
+		vm_fault_copy_entry(dst_map, src_map, dst_entry, src_entry,
+		    fork_charge);
 	}
 }
 

Received on Thu Jul 02 2009 - 17:44:51 UTC