Re: page fault in igb driver on 8.0-RC2

From: Jack Vogel <jfvogel_at_gmail.com>
Date: Wed, 11 Nov 2009 14:19:21 -0800
On Wed, Nov 11, 2009 at 12:31 PM, Pyun YongHyeon <pyunyh_at_gmail.com> wrote:

> On Tue, Nov 10, 2009 at 03:18:09PM -0500, Mike Tancsa wrote:
> > At 02:20 PM 11/10/2009, Jack Vogel wrote:
> > >This is a fix for this problem, please apply and test this.
> >
> > Hi,
> >         Thanks! Yes, I am able to use both ports of the NIC now and
> > no panics yet. Prior to this patch, bringing up both ports resulted
> > in a non-functioning NIC and panic!  Generating some UDP and TCP
> > traffic through the box, all seems to be OK on first blush.
> >
>
> I think this is a separate issue. Jack's patch surely reduces the
> chance of a bus_dmamap_load_mbuf_sg(9) failure because it removes
> an unnecessary DMA alignment restriction, but once the failure
> happens you would get the same result. Note, the original poster's
> machine is amd64.
>
Ya, so maybe a more robust solution to that failure would be a good
thing, but this is not the time for more elaborate code. Having this small
fix put in is low risk, and the better failure response can come later.

In fact, I have a number of places in the ixgbe driver where I'm pondering
over what to do in failure mode anyway.

Jack
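
For anyone following along, the effect of the alignment argument discussed above can be sketched in userland C. This is only a simplified model built on the thread's own description (a tighter alignment makes more buffers need bounce pages, and running out of those gives ENOMEM); it is not the kernel's bus_dma code. The struct `dma_tag_model`, the functions `needs_bounce()` and `count_bounces()`, and the 4096-byte value standing in for PAGE_SIZE are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two tag fields that matter here. */
struct dma_tag_model {
	uint64_t alignment;	/* power of two; 1 means no restriction */
	uint64_t lowaddr;	/* highest address usable without bouncing */
};

/*
 * Return true if a buffer starting at paddr could not be handed to the
 * device directly under this model and would need a bounce page.
 */
bool
needs_bounce(const struct dma_tag_model *tag, uint64_t paddr)
{
	/* The mask test below is only valid for power-of-two alignments. */
	assert(tag->alignment != 0 &&
	    (tag->alignment & (tag->alignment - 1)) == 0);

	if (paddr > tag->lowaddr)
		return (true);
	return ((paddr & (tag->alignment - 1)) != 0);
}

/* Count how many of the n start addresses would need a bounce page. */
size_t
count_bounces(const struct dma_tag_model *tag, const uint64_t *addrs,
    size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (needs_bounce(tag, addrs[i]))
			count++;
	return (count);
}
```

With the sample addresses 0x1000, 0x1002, 0x2800 and 0x3000, a tag with alignment 4096 flags two of the four (0x1002 and 0x2800), while a tag with alignment 1 flags none. That is the sense in which relaxing the alignment lowers the demand for bounce pages without removing the failure path itself.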


> > I will try some more extensive tests over the next little while.
> >
> > igb0: Excessive collisions = 0
> > igb0: Sequence errors = 0
> > igb0: Defer count = 0
> > igb0: Missed Packets = 0
> > igb0: Receive No Buffers = 40
> > igb0: Receive Length Errors = 0
> > igb0: Receive errors = 2
> > igb0: Crc errors = 4
> > igb0: Alignment errors = 0
> > igb0: Collision/Carrier extension errors = 0
> > igb0: RX overruns = 0
> > igb0: watchdog timeouts = 0
> > igb0: XON Rcvd = 0
> > igb0: XON Xmtd = 0
> > igb0: XOFF Rcvd = 0
> > igb0: XOFF Xmtd = 0
> > igb0: Good Packets Rcvd = 103212774
> > igb0: Good Packets Xmtd = 9347339
> > igb0: TSO Contexts Xmtd = 0
> > igb0: TSO Contexts Failed = 0
> > igb1: Excessive collisions = 0
> > igb1: Sequence errors = 0
> > igb1: Defer count = 0
> > igb1: Missed Packets = 0
> > igb1: Receive No Buffers = 0
> > igb1: Receive Length Errors = 0
> > igb1: Receive errors = 0
> > igb1: Crc errors = 0
> > igb1: Alignment errors = 0
> > igb1: Collision/Carrier extension errors = 0
> > igb1: RX overruns = 0
> > igb1: watchdog timeouts = 0
> > igb1: XON Rcvd = 0
> > igb1: XON Xmtd = 0
> > igb1: XOFF Rcvd = 0
> > igb1: XOFF Xmtd = 0
> > igb1: Good Packets Rcvd = 9365642
> > igb1: Good Packets Xmtd = 17781877
> > igb1: TSO Contexts Xmtd = 988
> > igb1: TSO Contexts Failed = 0
> >
> >
> >
> > # ./netsend 10.255.255.3 600 300 280000 10
> > Sending packet of payload size 300 every 0.000003571s for 10 seconds
> >
> > start:             1257884127.000000000
> > finish:            1257884137.000003339
> > send calls:        2800336
> > send errors:       1970
> > approx send rate:  279836
> > approx error rate: 0
> > waited:            1259257
> > approx waits/sec:  125925
> > approx wait rate:  0
> > # traceroute 10.255.255.3
> > traceroute to 10.255.255.3 (10.255.255.3), 64 hops max, 40 byte packets
> >  1  1.1.1.1 (1.1.1.1)  0.096 ms  0.073 ms  0.115 ms
> >  2  10.255.255.3 (10.255.255.3)  67.953 ms  0.297 ms  0.241 ms
> >
> > The box with the igb nics has the interfaces 1.1.1.1 and 10.255.255.1
> >
> >         ---Mike
> >
> >
> > >Jack
> > >
> > >--- if_igb.c    (revision 197079)
> > >+++ if_igb.c    (working copy)
> > >@@ -2654,7 +2654,7 @@
> > >     int error;
> > >
> > >     error = bus_dma_tag_create(bus_get_dma_tag(adapter->dev), /* parent */
> > >-                IGB_DBA_ALIGN, 0,    /* alignment, bounds */
> > >+                1, 0,            /* alignment, bounds */
> > >                 BUS_SPACE_MAXADDR,    /* lowaddr */
> > >                 BUS_SPACE_MAXADDR,    /* highaddr */
> > >                 NULL, NULL,        /* filter, filterarg */
> > >@@ -2867,7 +2867,7 @@
> > >      * Setup DMA descriptor areas.
> > >      */
> > >     if ((error = bus_dma_tag_create(NULL,        /* parent */
> > >-                   PAGE_SIZE, 0,        /* alignment, bounds */
> > >+                   1, 0,            /* alignment, bounds */
> > >                    BUS_SPACE_MAXADDR,    /* lowaddr */
> > >                    BUS_SPACE_MAXADDR,    /* highaddr */
> > >                    NULL, NULL,        /* filter, filterarg */
> > >@@ -3554,7 +3554,7 @@
> > >     ** it may not always use this.
> > >     */
> > >     if ((error = bus_dma_tag_create(NULL,        /* parent */
> > >-                   PAGE_SIZE, 0,    /* alignment, bounds */
> > >+                   1, 0,        /* alignment, bounds */
> > >                    BUS_SPACE_MAXADDR,    /* lowaddr */
> > >                    BUS_SPACE_MAXADDR,    /* highaddr */
> > >                    NULL, NULL,        /* filter, filterarg */
> > >
> > >
> > >
> > >On Tue, Nov 10, 2009 at 10:57 AM, Jack Vogel
> > ><jfvogel_at_gmail.com> wrote:
> > >I have repro'd this failure this morning and think I have a fix for
> > >it, I am testing that soon.
> > >
> > >Stay tuned,
> > >
> > >Jack
> > >
> > >
> > >
> > >On Mon, Nov 9, 2009 at 6:28 PM, Mike Tancsa
> > ><mike_at_sentex.net> wrote:
> > >At 07:33 PM 11/9/2009, Jack Vogel wrote:
> > >Some reason you aren't using amd64? I will have a system installed that way
> > >and see if I can repro it then, thanks.
> > >
> > >
> > >
> > >I had found in the past i386 was faster for firewall and routing
> > >applications.   Perhaps that's different now; I will give it a try
> > >again to see if there is any difference.
> > >
> > >pciconf and dmesg attached.
> > >
> > >       ---Mike
> > >
> > >
> > >
> > >Jack
> > >
> > >
> > >
> > >On Mon, Nov 9, 2009 at 4:22 PM, Mike Tancsa
> > ><mike_at_sentex.net> wrote:
> > >At 05:59 PM 11/9/2009, Jack Vogel wrote:
> > >Are you using standard MTU or jumbo? That get_buf error is ENOMEM; it looks
> > >like that happens when reserve_bounce_pages() in the bus_dma code fails.
> > >
> > >Are you maybe using a 32 bit kernel? I have not seen this failure here.
> > >
> > >
> > >Hi Jack,
> > >      Standard MTU and i386
> > >
> > >      ---Mike
> > >
> > >
> > >
> > >--------------------------------------------------------------------
> > >Mike Tancsa,                                      tel +1 519 651 3400
> > >Sentex Communications,                            mike_at_sentex.net
> > >Providing Internet since 1994                    www.sentex.net
> > >Cambridge, Ontario Canada                         www.sentex.net/mike
> > >
> > >
> > >
> > >
> >
> > --------------------------------------------------------------------
> > Mike Tancsa,                                      tel +1 519 651 3400
> > Sentex Communications,                            mike_at_sentex.net
> > Providing Internet since 1994                    www.sentex.net
> > Cambridge, Ontario Canada                         www.sentex.net/mike
> >
>
Received on Wed Nov 11 2009 - 21:19:25 UTC
