On Thu, May 14, 2009 at 5:37 PM, Xin LI <delphij_at_delphij.net> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, Alexander, > > Alexander Sack wrote: > [...] >>>> Preliminary testing shows no more panics start and stopping ports >>>> under heavy load (panics were almost immediate otherwise). >>>> >>>> Thoughts? >>> I think this would solve the problem but I'm not sure whether this would >>> increase some overhead on the RX path. It seems that there is a race >>> between bge_release_resources() and bge_intr(), I mean, it might be a >>> good idea to "drain" bge_intr() instead? >> >> Are you talking about detach time? Because bge_stop() gets called >> before bge_release_resources() and stops host interrupts so where is >> the race again? I mean at this point no more interrupts should be >> delivered to bge_intr() (I can confirm from spec since BGE has >> released it in the wild). So why would you "drain" it at this >> point....(the hardware is down including the firmware). >> >> I agree it adds a little overhead to the standard bge_rxeof() path >> which I agree is very sensitive to change. However, I think the check >> at top is tolerable since the other recourse is crash. I mean its >> very easy to reproduce. Flood a Broadcom card with traffic then stop >> the card and let the race begin...it will go down in bge_rxeof() after >> bge_stop releases the lock. >> >> I actually did not look at changing anything structurally to perhaps >> make this whole predicament better but minimally there should be a >> shield against this no? > > I'm all for a workaround at this point, but I really want to make sure > that we are doing things in a safer manner if we can reach it timely. > > What about something like this, in my opinion, because bge_rxeof() would > unlock sc in the middle, it would be probably better to re-check the > IFF_DRV_RUNNING flag and avoid calling txeof after that (I've moved the > condition to after we re-grab the lock to match the rest of the code: > > ===== > [delphij_at_charlie] /sys/dev/bge> svn diff -x -p > Index: if_bge.c > =================================================================== > - --- if_bge.c (revision 191995) > +++ if_bge.c (working copy) > _at__at_ -3073,7 +3073,7 _at__at_ bge_rxeof(struct bge_softc *sc) > bus_dmamap_sync(sc->bge_cdata.bge_rx_jumbo_ring_tag, > sc->bge_cdata.bge_rx_jumbo_ring_map, BUS_DMASYNC_POSTREAD); > > - - while(sc->bge_rx_saved_considx != > + while (sc->bge_rx_saved_considx != > sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx) { > struct bge_rx_bd *cur_rx; > uint32_t rxidx; > _at__at_ -3193,6 +3193,9 _at__at_ bge_rxeof(struct bge_softc *sc) > BGE_UNLOCK(sc); > (*ifp->if_input)(ifp, m); > BGE_LOCK(sc); > + > + if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) > + return; > } Xin this looks fine by me, I actually put this up in the while loop as I mentioned before which I think is functionality equivalent (can you gain some optimizations by putting in the while loop though compiler wise than a separate compilation unit?). > if (stdcnt > 0) > _at__at_ -3301,6 +3304,10 _at__at_ bge_poll(struct ifnet *ifp, enum poll_cmd cmd, int > > sc->rxcycles = count; > bge_rxeof(sc); > + if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) { > + BGE_UNLOCK(sc); > + return; > + } So here if bge_rxeof() drops the lock, bge_stop() sneaks in reset flag and doesn't bother to process txeof(). I can buy that. > bge_txeof(sc); > if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) > bge_start_locked(ifp); > _at__at_ -3370,7 +3377,9 _at__at_ bge_intr(void *xsc) > if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > /* Check RX return ring producer/consumer. */ > bge_rxeof(sc); > + } > > + if (ifp->if_drv_flags & IFF_DRV_RUNNING) { > /* Check TX ring producer/consumer. */ > bge_txeof(sc); > } Same here, avoid txeof() if bge_rxeof()/bge_stop() reset the DRV_RUNNING flag. Luckily txeof() doesn't drop the lock so no check is needed in there. I'm happier....should have thought about poll though (I hate polling). -apsReceived on Thu May 14 2009 - 20:05:29 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:47 UTC