"Steven Hartland" <killing_at_multiplay.co.uk> wrote: > > "Steven Hartland" <killing_at_multiplay.co.uk> wrote: > > > > > From: "Fabian Keil" <freebsd-listen_at_fabiankeil.de> > > > > > > > After updating my laptop to yesterday's CURRENT (r265216), > > > > I got the following fatal double fault on boot: > > > > http://www.fabiankeil.de/bilder/freebsd/kernel-panic-r265216/ > > > > > > > > My previous kernel was based on r264721. > > > > > > > > I'm using a couple of custom patches, some of them are ZFS-related > > > > and thus may be part of the problem (but worked fine for months). > > > > I'll try to reproduce the panic without the patches tomorrow. > > > > > > > > > > Your seeing a stack overflow in the new ZFS queuing code, which I > > > believe is being triggered by lack of support for TRIM in one of > > > your devices, something Xin reported to me yesterday. > > > > > > I commited a fix for failing TRIM requests processing slowly last > > > night so you could try updating to after r265253 and see if that > > > helps. > > > > Thanks. The hard disk is indeed unlikely to support TRIM requests, > > but I can still reproduce the problem with a kernel based on r265255. > > Thanks for testing, I suspect its still a numbers game with how many items > are outstanding in the queue and now that free / TRIM requests are also > now queued its triggering the failure. > > If your just on a HDD try setting the following in /boot/loader.conf as > a temporary workaround: > vfs.zfs.trim.enabled=0 That worked, thanks. > > > I still need to investigate the stack overflow more directly which > > > appears to be caused by the new zfs queuing code when things are > > > running slowly and there's a large backlog of IO's. > > > > > > I would be interested to know you config there so zpool layout and > > > hardware in the mean time. 
> >
> > The system is a Lenovo ThinkPad R500:
> > http://www.nycbug.org/index.cgi?action=dmesgd&do=view&dmesgid=2449
> >
> > I'm booting from UFS; the panic occurs while the pool is being imported.
> >
> > The pool is located on a single geli-encrypted slice:
> >
> > fk_at_r500 ~ $zpool status tank
> >   pool: tank
> >  state: ONLINE
> >   scan: scrub repaired 0 in 4h11m with 0 errors on Sat Mar 22 18:25:01 2014
> > config:
> >
> > 	NAME           STATE     READ WRITE CKSUM
> > 	tank           ONLINE       0     0     0
> > 	  ada0s1d.eli  ONLINE       0     0     0
> >
> > errors: No known data errors
> >
> > Maybe geli fails TRIM requests differently.
>
> That helps. Xin also reported the issue with geli, and that's what I'm
> testing with. I believe this is a factor because it significantly slows
> things down, again meaning more items in the queues, but I've only
> managed to trigger it once here as the machine I'm using is pretty quick.

It probably doesn't make a difference, but my system is rather old and
thus I'm still using geli version 3 for ada0s1d.eli, while geli init
nowadays defaults to geli version 7. The system certainly is also slow,
though.

Fabian
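[Editor's sketch of the workaround and checks discussed above, assuming a
FreeBSD system with the ZFS TRIM tunable and a geli provider named
ada0s1d as in this thread; treat it as illustrative command fragments,
not output from the original poster's machine:]

```shell
# Temporary workaround from the thread: disable ZFS TRIM at boot by
# adding this line to /boot/loader.conf, then rebooting:
#   vfs.zfs.trim.enabled=0

# The current value of the tunable can be inspected with sysctl:
sysctl vfs.zfs.trim.enabled

# To see which geli metadata version a provider was initialized with
# (version 3 vs. the newer default 7 mentioned above), dump its
# on-disk metadata and look at the "version:" field:
geli dump ada0s1d
```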
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:48 UTC