Re: Shutdown errors and timeout

From: Johan Hendriks <joh.hendriks_at_gmail.com>
Date: Mon, 16 Nov 2020 19:16:51 +0100
On 14/11/2020 13:03, Mateusz Piotrowski wrote:
> Hi,
>
> On 11/14/20 1:19 AM, Tomoaki AOKI wrote:
>> On Fri, 13 Nov 2020 20:04:59 +0900 (JST)
>> Yasuhiro KIMURA <yasu_at_utahime.org> wrote:
>>
>>> From: Johan Hendriks <joh.hendriks_at_gmail.com>
>>>
>>>> Hello all, I have two FreeBSD 13 machines, one bare metal and one a
>>>> VirtualBox VM, both of which I update about once a week.
>>>>
>>>> The virtual machine seems to fail to stop something and hits a
>>>> timeout after 90 seconds.
>>>>
>>>> The console ends with
>>>>
>>>> Writing entropy file: .
>>>> Writing early boot entropy file: .
>>>>
>>>> 90 second watchdog timeout expired. Shutdown terminated.
>>>> Fri Nov13 11:20:40 CEST 2020
>>>> Nov 13 11:20:40 test-head init[1]: /etc/rc.shutdown terminated
>>>> abnormally, going to single user mode
>>>> ...
>>>>
>>>> On the bare metal machine I see the following:
>>>> Writing entropy file: .
>>>> Writing early boot entropy file: .
>>>> cannot unmount '/var/run': umount failed
>>>> cannot unmount '/var/log': umount failed
>>>> cannot unmount '/var': umount failed
>>>> cannot unmount '/usr/home': umount failed
>>>> cannot unmount '/usr': umount failed
>>>> cannot unmount '/': umount failed
>>>>
>>> (snip)
>>>> The pools have not been upgraded since the latest OpenZFS import;
>>>> maybe that is related?
>>>>
>>>> FreeBSD test-freebsd-head 13.0-CURRENT FreeBSD 13.0-CURRENT #2
>>>> r367585:
>>>>
>>>> I first noticed this about a week ago.
>>> I'm facing the same problem with 13.0-CURRENT amd64 r367487 on
>>> VirtualBox. In my case I use autofs to mount a remote file system from
>>> a 12.2-RELEASE amd64 server over NFSv4. If a file system is still
>>> mounted by autofs, the watchdog timeout occurs during shutdown. The
>>> watchdog timeout can be worked around by running `automount -fu`
>>> before shutting down, but the 'cannot unmount ...' error messages are
>>> still displayed.
>>>
>>> I added 'rc_debug="YES"' to /etc/rc.conf to check which rc script
>>> causes this message. It is displayed when the following `zfs_stop_main`
>>> function of /etc/rc.d/zfs is executed.
>>>
>>> ----------------------------------------------------------------------
>>> zfs_stop_main()
>>> {
>>>     zfs unshare -a
>>>     zfs unmount -a
>>> }
>>> ----------------------------------------------------------------------
>>>
>>> At this point the syslog process is still running and has some files
>>> open under /var/log, so it makes sense that `zfs unmount -a` results
>>> in the message.
>>>
>>> Probably the order in which rc scripts are executed at shutdown should
>>> be changed so that `/etc/rc.d/zfs faststop` runs after all processes
>>> other than `init` have exited.
>> This happens on stable/12, too.
>> As a workaround, reverting r367291 on head (r367546 on stable/12)
>> would stop the issue until this is really fixed.
>>
>> If you have shared datasets or jails that mount datasets, this
>> workaround is discouraged. Read the commit message for details.
>
> I've committed r367291 and r367546.
>
> I am not sure if I can think of a proper fix for the described issues, 
> so I guess the best idea would be to revert those changes for now 
> until we figure out how to do it properly.
>
> Sorry for the regression.
>
> Best,
>
> Mateusz


I can confirm that after reverting the mentioned commit I no longer see
the symptoms when I reboot my servers.
Thank you all for your time, and no sorry needed ;-)

regards,
Johan
Received on Mon Nov 16 2020 - 17:16:57 UTC
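
The diagnosis quoted in the thread (a process such as syslog still holding
files open under /var/log when `zfs unmount -a` runs) could be checked by
hand before shutting down. The following is a minimal sketch, assuming
FreeBSD's fstat(1) and its default column layout (USER, CMD, PID, ...).
The function name open_files_on is hypothetical and is not part of any
rc script.

```shell
#!/bin/sh
# Hypothetical helper: for each given path, list the commands and PIDs
# that hold files open on the file system containing that path.
open_files_on() {
    for mp in "$@"; do
        echo "== $mp"
        # fstat -f restricts the listing to the file system containing $mp.
        # Skip the header line, keep the CMD and PID columns, drop duplicates.
        fstat -f "$mp" 2>/dev/null | awk 'NR > 1 { print $2, $3 }' | sort -u
    done
}
```

For example, running `open_files_on /var/log /var/run` as root shortly
before shutdown would list the command names and PIDs that still hold
files open on those file systems.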