Re: dhclient taking all cpu

From: Brooks Davis <brooks_at_one-eyed-alien.net>
Date: Wed, 27 Jul 2005 12:57:21 -0700
On Wed, Jul 27, 2005 at 02:47:21PM -0500, Eric Anderson wrote:
> Brooks Davis wrote:
> >On Wed, Jul 27, 2005 at 02:35:06PM -0500, Eric Anderson wrote:
> >
> >>Brooks Davis wrote:
> >>
> >>>On Tue, Jul 26, 2005 at 04:39:33PM -0700, Brooks Davis wrote:
> >>>
> >>>
> >>>>On Tue, Jul 26, 2005 at 06:53:17PM -0400, Jung-uk Kim wrote:
> >>>>
> >>>>
> >>>>>On Tuesday 26 July 2005 04:00 pm, Wilko Bulte wrote:
> >>>>>
> >>>>>
> >>>>>>On Tue, Jul 26, 2005 at 12:33:24PM -0700, Brooks Davis wrote..
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>On Mon, Jul 25, 2005 at 10:39:09PM -0400, Mike Jakubik wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>>On Mon, July 25, 2005 9:54 pm, Brooks Davis said:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>>>Probably something wrong with your interface, but you
> >>>>>>>>>>>haven't provided any useful information so who knows.  At
> >>>>>>>>>>>the very least, I need to know what interface you are
> >>>>>>>>>>>running on, something about its status, and if both
> >>>>>>>>>>>dhclient processes are running.
> >>>>>>>>>>
> >>>>>>>>>>The interface is xl0 (3Com 3c905C-TX Fast Etherlink XL), and
> >>>>>>>>>>it worked in this machine fine for as long as I remember.
> >>>>>>>>>>This seems to have happened since a recent cvsup and
> >>>>>>>>>>buildworld from ~6-BETA to 7-CURRENT. I rebooted three
> >>>>>>>>>>times, and the problem occurred roughly a minute after bootup.
> >>>>>>>>>>On the fourth time however, it seems to be ok so far.
> >>>>>>>>>
> >>>>>>>>>That sounds like a problem with the code that handles the
> >>>>>>>>>link state notifications in the interface driver.  The
> >>>>>>>>>notifications are a relatively new feature that we're only now
> >>>>>>>>>starting to use heavily so there are going to be bumps in the
> >>>>>>>>>road.  It would be interesting to know if you see link state
> >>>>>>>>>messages promptly if you plug and unplug the network cable.
> >>>>>>>>
> >>>>>>>>It seems to be back at it again, this time it took longer to
> >>>>>>>>kick in. Here is a "ps auxw|grep dhclient" :
> >>>>>>>>
> >>>>>>>>_dhcp      219 93.5  0.2  1484  1136  ??  Rs    8:49PM   5:06.00 dhclient: xl0 (dhclient)
> >>>>>>>>root       193  0.0  0.2  1484  1088  d0- S     8:49PM   0:00.02 dhclient: xl0 [priv] (dhclient)
> >>>>>>>>
> >>>>>>>>top:
> >>>>>>>>
> >>>>>>>>PID USERNAME      THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
> >>>>>>>>219 _dhcp           1 129    0  1484K  1136K RUN      9:33 94.24% dhclient
> >>>>>>>>
> >>>>>>>>Nothing in dmesg about link state changes on xl0. Unplugging
> >>>>>>>>and replugging the network cable results in link state
> >>>>>>>>notification within a couple seconds.
> >>>>>>>
> >>>>>>>Could you see what happens if you run dhclient in the foreground?
> >>>>>>>Just running "dhclient -d xl0" should do it.  I'd like to know
> >>>>>>>what sort of output it's generating.
> >>>>>>
> >>>>>>In my case it is not displaying anything:
> >>>>>>
> >>>>>>
> >>>>>>chuck#dhclient -d ath0
> >>>>>>DHCPREQUEST on ath0 to 255.255.255.255 port 67
> >>>>>>DHCPACK from 192.168.5.254
> >>>>>>bound to 192.168.5.20 -- renewal in 21600 seconds.
> >>>>>>
> >>>>>><nothing>
> >>>>>>
> >>>>>>I can tell the phenomenon occurs when my laptop fan springs to
> >>>>>>life:
> >>>>>>
> >>>>>>CPU states: 96.5% user,  0.0% nice,  2.7% system,  0.8% interrupt,  0.0% idle
> >>>>>>Mem: 48M Active, 28M Inact, 50M Wired, 680K Cache, 34M Buf, 115M Free
> >>>>>>Swap: 257M Total, 257M Free
> >>>>>>
> >>>>>>PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
> >>>>>>719 _dhcp       1 129    0  1384K  1092K RUN      2:14 93.55% dhclient
> >>>>>>607 root        1  98    0 34584K 21212K select   0:09  1.81% Xorg
> >>>>>>663 wb          4  20    0 46712K 40224K kserel   0:27  0.00% mozilla-bin
> >>>>>>503 root        1   8    0  1184K   796K nanslp   0:07  0.00% powerd
> >>>>>>
> >>>>>>Took (best guess) approx 5-10 minutes for the effect to kick in.
> >>>>>
> >>>>>FYI, I have the same issues with bge(4) and ndis(4).
> >>>>
> >>>>I've seen it on ath and em interfaces now, but am not sure what's going
> >>>>on, and have no idea how to reproduce the problem.  As also reported by
> >>>>Bakul Shah, we seem to be getting into a state where receive_packet() is
> >>>>spinning.  I'm not seeing an obvious way for this to be possible.
> >>>
> >>>
> >>>I think I've found it.  There was a really odd typo (= instead of +) in
> >>>the code that handles undersized captures on the bpf socket.  Please try
> >>>the following patch and see if it solves the problem.  I'm testing here,
> >>>but I don't have a reliable way to trigger the bug.  The fix is fairly
> >>>obvious so I'll commit it to head shortly.
> >>
> >>It's been 20 minutes without any issues - I think that did it.  Thanks!
> >
> >
> >Great!  Thanks for the report.
> 
> I give up.  Now it's back to its dirty ways.  Ran for 22 mins without
> issue (with -d option), so I reran without the -d, and it spiked within 
> a few minutes.
> 
> 
> I'll now wait until someone else claims it works before commenting on it 
> since my computer seems to enjoy making me look bad. :)

Crap.  You did remember to install the patched version before running it
the normal way, right?

If you could compile it with debugging and get me a dump and executable
that would help.
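
For anyone following along, here is a rough sketch of how a "= instead of +"
slip in the skip-an-undersized-capture path can make a buffer walk spin.
This is not the dhclient source or the patch: struct rec_hdr, REC_ALIGN,
put_record() and count_complete() are hypothetical names made up for the
illustration, the record header only loosely mirrors a bpf header, and the
max_steps guard exists only so the buggy case returns instead of hanging.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record header; a stand-in for a real bpf header. */
struct rec_hdr {
    unsigned int caplen;   /* bytes actually captured */
    unsigned int datalen;  /* original packet length on the wire */
};

/* Round up to a 4-byte boundary, in the spirit of BPF_WORDALIGN. */
#define REC_ALIGN(x) (((size_t)(x) + 3) & ~(size_t)3)

/* Write one record header at off (data bytes are left as they are)
 * and return the offset of the next record. */
size_t put_record(unsigned char *buf, size_t off,
                  unsigned int caplen, unsigned int datalen)
{
    struct rec_hdr h = { caplen, datalen };

    memcpy(buf + off, &h, sizeof(h));
    return REC_ALIGN(off + sizeof(h) + caplen);
}

/*
 * Walk the records in buf and return the number of complete captures.
 * With buggy != 0, the skip for a truncated capture assigns the offset
 * instead of adding to it.  max_steps bounds the loop so a walk that
 * stops making progress returns -1 rather than spinning forever.
 */
int count_complete(const unsigned char *buf, size_t len,
                   int buggy, int max_steps)
{
    size_t off = 0;
    int complete = 0;

    assert(buf != NULL);
    while (off + sizeof(struct rec_hdr) <= len) {
        struct rec_hdr h;

        if (max_steps-- <= 0)
            return -1;              /* no forward progress */
        memcpy(&h, buf + off, sizeof(h));
        off += sizeof(h);
        if (h.caplen != h.datalen) {
            /* Undersized capture: drop it and move to the next record. */
            if (buggy)
                off = REC_ALIGN(h.caplen);        /* '=': jumps to a fixed offset */
            else
                off = REC_ALIGN(off + h.caplen);  /* '+': advances past the data */
            continue;
        }
        complete++;
        off = REC_ALIGN(off + h.caplen);
    }
    return complete;
}
```

With a zeroed buffer holding a complete record, a truncated one, and another
complete one, the '+' version walks off the end of the buffer and reports two
complete captures.  The '=' version keeps getting sent back to the same small
offset whenever it meets the truncated record, so it only stops when the step
limit trips.  Whether the real bug behaved exactly this way is an assumption.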

-- Brooks

-- 
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4

Received on Wed Jul 27 2005 - 17:57:24 UTC
