Re: panic: Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed at /usr/src/sys/kern/sched_ule.c:2137

From: bob prohaska <fbsd_at_www.zefox.net>
Date: Wed, 6 Jun 2018 16:58:01 -0700
On Wed, Jun 06, 2018 at 08:55:39PM +0200, Ronald Klop wrote:
> On Sat, 02 Jun 2018 13:40:27 +0200, Ronald Klop <ronald-lists_at_klop.ws>  
> wrote:
> 
> 
> How do you ever run a -j4 buildworld? My RPI3 starts building clang/llvm  
> with sometimes 500 MB+ per process so everything starts swapping like hell  
> and takes forever to run.
> 

Lately, never 8-)

When I started playing with an RPI3, in late 2016, -j4 buildworlds
worked usably well.  Early in 2018 problems appeared, among them the
"Assertion td->td_lock == TDQ_LOCKPTR(tdq) failed" panic.

Things didn't really go to pot until somewhat later, when the swap frenzy
issue reared its head, and they haven't improved much since.

Sadly, when I tried the suggested swap frenzy workaround of
setting sysctl vm.pageout_update_period=0, a -j4 buildworld
went back to hitting the old td_lock assertion, so it looks
as if both bugs are alive and kicking.
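
For anyone wanting to try the same setting, here is a rough sketch of
applying it for the current boot. The helper name set_pageout_period is
made up for illustration and is not a FreeBSD tool; the only thing taken
from the thread is the vm.pageout_update_period=0 setting itself. It
needs to run as root to change the value.

```shell
#!/bin/sh
# Sketch only: apply the vm.pageout_update_period workaround until reboot.
# set_pageout_period is a hypothetical helper name used for illustration.
set_pageout_period() {
    value="${1:-0}"

    # Bail out early if the OID does not exist on this system.
    if ! sysctl -n vm.pageout_update_period >/dev/null 2>&1; then
        echo "vm.pageout_update_period not available" >&2
        return 1
    fi

    # Report the previous value so it can be restored by hand later.
    old=$(sysctl -n vm.pageout_update_period)
    echo "previous vm.pageout_update_period: $old"

    # Apply the new value; this does not survive a reboot.
    sysctl vm.pageout_update_period="$value"
}
```

Calling "set_pageout_period 0" applies the workaround. Putting the line
vm.pageout_update_period=0 in /etc/sysctl.conf should make it persistent,
though as noted above it did not cure the td_lock panic here.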

Just to complicate matters, I was in the habit of
using a USB flash drive as both an outboard file
system (/usr/, /var/ and /tmp/) and as a swap device.
A very common reaction was to blame the flash device
for the trouble, though so far as I can tell a Sandisk
Extreme USB flash drive isn't much slower, if any, than
a mechanical hard disk for random writes. The same USB
flash devices on a Pi2 running 11-Stable seem to be fine.

However, turning off the USB flash swap device does seem
to reduce the number of "indefinite wait buffer" messages
on the console (they're usually not fatal) so I think there
is still something amiss. Whether it's the flash, the USB
or the VM system is unclear to me.

For now the workarounds are to run buildworld with no explicit
-j value (presumably equivalent to -j1), to use only swap on
the microSD card and to use the -DNO_CLEAN option for most
buildworld sessions, doing an explicit "make clean" or
"rm -rf /usr/obj/usr" when necessary. In a few cases it
seemed helpful to start with "make kernel-toolchain" then
follow with "make -DNO_CLEAN buildworld" but I didn't keep
good enough records to be certain of the benefits.
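
The build part of those workarounds could be scripted roughly as below.
This is a sketch, not a tested recipe: build_world is a made-up helper
name, SRCDIR and OBJDIR are assumed defaults (/usr/src and /usr/obj/usr),
and whether the kernel-toolchain step really helps is unconfirmed. The
swap arrangement is left out because it depends on local device names.

```shell
#!/bin/sh
# Sketch of the buildworld workaround: no -j flag, -DNO_CLEAN by default.
# build_world is a hypothetical helper; SRCDIR and OBJDIR are assumptions.
SRCDIR="${SRCDIR:-/usr/src}"
OBJDIR="${OBJDIR:-/usr/obj/usr}"

build_world() {
    mode="${1:-incremental}"

    if [ "$mode" = "clean" ]; then
        # Occasional full rebuild: discard the old object tree first.
        rm -rf "$OBJDIR" || return 1
    fi

    # Optional first step that seemed to help in a few cases.
    make -C "$SRCDIR" kernel-toolchain || return 1

    # No -j flag, so make runs one job at a time and memory use stays lower.
    make -C "$SRCDIR" -DNO_CLEAN buildworld
}
```

Run "build_world" for the usual incremental build, or "build_world clean"
when the object tree needs to be thrown away first.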

Apologies for the length, HTH 

bob prohaska
Received on Wed Jun 06 2018 - 21:57:57 UTC
