Re: mlx driver related kernel panic in FreeBSD 5.2.1-RELEASE

From: Scott Long <scottl_at_freebsd.org>
Date: Mon, 1 Mar 2004 11:06:37 -0700 (MST)
Great to hear, thanks a lot for testing it.  It looks like the problem can
manifest itself when a whole lot of I/O comes into the driver at once.  An
easy way that I've found to generate a pattern like this is to turn
softupdates off, then start 5-10 concurrent copies of large trees with
lots of files.  Then when the pagedaemon does its 30-second interval run,
it'll likely send 500-1000 I/O requests at once to the card.

Scott
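
The load pattern described above can be approximated with a small helper
that starts several recursive copies at once.  This is only a sketch under
assumptions: the name run_parallel_copies and the copy.N destination naming
are made up for illustration, it relies on the system cp(1), and softupdates
would still have to be turned off separately on the filesystem under test
(for example with "tunefs -n disable" while it is unmounted).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start `ncopies` concurrent "cp -R src dst_dir/copy.N" processes and wait
 * for all of them.  Returns the number of copies that failed, or -1 if a
 * fork() failed.  Hypothetical helper, not part of any FreeBSD tool.
 */
int
run_parallel_copies(const char *src, const char *dst_dir, int ncopies)
{
    int started = 0;
    int failed = 0;

    assert(src != NULL && dst_dir != NULL);

    for (int i = 0; i < ncopies; i++) {
        char dst[1024];
        pid_t pid;

        snprintf(dst, sizeof(dst), "%s/copy.%d", dst_dir, i);
        pid = fork();
        if (pid < 0)
            break;                      /* could not start this copy */
        if (pid == 0) {
            execlp("cp", "cp", "-R", src, dst, (char *)NULL);
            _exit(127);                 /* only reached if exec failed */
        }
        started++;
    }

    /* Reap every child that was started and count the failures. */
    for (int i = 0; i < started; i++) {
        int status;

        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            failed++;
    }
    return (started == ncopies) ? failed : -1;
}
```

Calling it with a large tree such as /usr/ports and 5 to 10 copies would
mirror the recipe above; whether that is enough to trigger the panic on a
given machine is not something this sketch can promise.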

On Mon, 1 Mar 2004, Dexter McNeil wrote:

> Scott,
>
> 	This patch seems to have fixed it. I've been trying for the past few
> hours to crash the machine with massive file copies (both local and via
> NFS) and it's stayed up! Many thanks!!
>
> 	Cheers,
> 	Dexter McNeil
>
> On Sat, Feb 28, 2004 at 12:36:34PM -0700, Scott Long wrote:
> > Dexter,
> >
> > I sent out a patch earlier today that might help.  If not, I'll take you
> > up on your offer for remote access to the machine.  The mlx driver
> > underwent a massive structural overhaul over the summer, and I might
> > have missed some edge cases.  Attached is the patch in case you missed
> > it.
> >
> > Scott
> >
> > Dexter McNeil wrote:
> > >Scott,
> > >	I can make remote access to the box and serial console port available
> > >if it will help.
> > >
> > >On Thu, Feb 26, 2004 at 04:42:05PM -0700, Scott Long wrote:
> > >
> > >>Thanks for the report, I'll look at it.
> > >>
> > >>Scott
> > >>
> > >>Dexter McNeil wrote:
> > >>
> > >>>OK, I seem to be able to get FreeBSD 5.2.1-RELEASE to reliably panic with
> > >>>a "panic: free: address 0xd8d5e000(0xd8d5e000) has not been allocated."
> > >>>on the serial console when it crashes.
> > >>>
> > >>>I'm doing two copies of the entire ports collection to separate
> > >>>subdirectories when the crash occurs. It seems to do this regardless of
> > >>>
> > >>>The machine is an IBM Netfinity 5500 with 4 x 550MHz P3 Xeon CPUs, 2 GB
> > >>>of RAM and a Mylex DAC1100 RAID controller. There are 6 73G disks
> > >>>attached to channel 0 of the controller; 5 disks are in a RAID 5 array,
> > >>>and the sixth is a hot spare.
> > >>>
> > >>>Any ideas?
> > >>>
> > >>>
> > >>>
> > >>>A 'trace' at the db> prompt yields:
> > >>>
> > >>>cpuid = 2;
> > >>>Debugger("panic")
> > >>>Stopped at      Debugger+0x55:  xchgl   %ebx,in_Debugger.0
> > >>>db> trace
> > >>>Debugger(c06d1f22,2,c06d07e3,e257dc30,100) at Debugger+0x55
> > >>>panic(c06d07e3,d8d5e000,d8d5e000,0,c06d07a4) at panic+0x156
> > >>>free(d8d5e000,c0705300,1,0,c0522975) at free+0xa3
> > >>>mlx_enquire(c8277000,c,c,c04c3b60,d8) at mlx_enquire+0xef
> > >>>mlx_periodic(c8277000,0,c06d2fed,d8,1) at mlx_periodic+0x174
> > >>>softclock(0,0,c06cfc87,23a,c81d4a98) at softclock+0x1b8
> > >>>ithread_loop(c42f1f00,e257dd48,c06cfaed,311,0) at ithread_loop+0x192
> > >>>fork_exit(c0518350,c42f1f00,e257dd48) at fork_exit+0xb4
> > >>>fork_trampoline() at fork_trampoline+0x8
> > >>>--- trap 0x1, eip = 0, esp = 0xe257dd7c, ebp = 0 ---
> > >>>
> > >
> > >
> >
>
> > Index: mlx.c
> > ===================================================================
> > RCS file: /usr/ncvs/src/sys/dev/mlx/mlx.c,v
> > retrieving revision 1.44
> > diff -u -r1.44 mlx.c
> > --- mlx.c	22 Feb 2004 09:52:46 -0000	1.44
> > +++ mlx.c	28 Feb 2004 17:48:59 -0000
> > @@ -1554,8 +1554,8 @@
> >      if ((mc->mc_complete == NULL) && (mc != NULL))
> >  	mlx_releasecmd(mc);
> >      /* we got an error, and we allocated a result */
> > -    if ((error != 0) && (mc->mc_data != NULL)) {
> > -	free(mc->mc_data, M_DEVBUF);
> > +    if ((error != 0) && (result != NULL)) {
> > +	free(result, M_DEVBUF);
> >  	mc->mc_data = NULL;
> >      }
> >      return(result);
>
>
> --
> The ultimate destination on the journey of life is a hole 6 feet deep.
> Enjoy the journey - the destination is nothing to write home about.
>
>
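
One plausible reading of the quoted diff is that mc->mc_data was being read
after mlx_releasecmd(mc) had already handed the command back, so under heavy
load the slot could be reused and the stale field passed to free().  The
userland sketch below illustrates that pattern with hypothetical names
(struct cmd, cmd_alloc, cmd_release, steal_slot, enquire_fixed); it is not
the driver code.  It also clears the local pointer before returning, which
the quoted hunk does not show.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a command slot; not the real struct mlx_command. */
struct cmd {
    void *data;     /* buffer attached to the command */
    int   in_use;
};

static struct cmd slot;         /* one-entry "pool" keeps the demo small */
static char other_buf[16];      /* buffer owned by the simulated second user */

static struct cmd *
cmd_alloc(void)
{
    if (slot.in_use)
        return NULL;
    slot.in_use = 1;
    slot.data = NULL;
    return &slot;
}

static void
cmd_release(struct cmd *c)
{
    c->in_use = 0;              /* slot may be reused from here on */
}

/* Simulates another request grabbing the slot right after release. */
static void
steal_slot(void)
{
    struct cmd *c = cmd_alloc();

    if (c != NULL)
        c->data = other_buf;    /* freeing slot.data now would be a bug */
}

/*
 * Issue a command whose outcome is `error`.  The result buffer is tracked
 * in a local variable, so cleanup never reads the released slot again.
 */
void *
enquire_fixed(size_t len, int error, void (*after_release)(void))
{
    void *result = malloc(len);
    struct cmd *c;

    if (result == NULL)
        return NULL;
    c = cmd_alloc();
    if (c == NULL) {
        free(result);
        return NULL;
    }
    c->data = result;
    cmd_release(c);
    if (after_release != NULL)
        after_release();        /* the slot may now belong to someone else */

    if (error != 0) {
        free(result);           /* the local pointer is still ours to free */
        result = NULL;
    }
    return result;
}
```

Had the error path called free(slot.data) instead, it would have been handed
other_buf, an address that never came from malloc(), which is the same shape
of failure as the "has not been allocated" panic in the original report.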
Received on Mon Mar 01 2004 - 09:04:27 UTC
