Re: SIGSEGV in /bin/sh after r322740 -> r322776 update

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Tue, 22 Aug 2017 18:34:42 +0300
On Tue, Aug 22, 2017 at 08:17:38AM -0700, David Wolfskill wrote:
> On Tue, Aug 22, 2017 at 04:19:58PM +0300, Konstantin Belousov wrote:
> > ...
> > > > Ok, can you rebuild kernel and libc from scratch ?  I.e. remove your
> > > > object directories.
> > > 
> > > I think I'll need a working /bin/sh to do that.  As noted, I could
> > > try the stable/11 /bin/sh; on the other hand, if it's dying in a
> > > library, that's not likely to help a whole lot. :-}
> > I highly suspect that this is not /bin/sh at all.  Backtrace strongly
> > suggests that the malloc() has issues, but again I suspect that the
> > reason is not an issue in malloc, but its use of TLS.
> > 
> > The amd64 changes were to the TLS base register handling.  So you might
> > try to boot previous kernel.  If this works out without replacing libc
> > then it is definitely TLS, but I still do not know what is wrong.
> > 
> > > 
> > > But yes: once we resolve the "working /bin/sh" issue, clearing
> > > /usr/obj & rebuilding is straightforward and shouldn't take too long.
> > ....
> 
> OK.  Booting from the previous kernel (/boot/kernel.old) allowed /bin/sh
> (et al.) to work without segfaults, so after clearing /usr/obj, I
> rebuilt r322776 from scratch (yes, userland as well as kernel).
> 
> On reboot, I watched the serial console, and noted:
> 
> ...
> Mounting local filesystems:.
> ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/perl5/5.24/mach/CORE
> 32-bit compatibility ldconfig path: /usr/lib32 /usr/lib32/compat
> Setting hostname: freebeast.catwhisker.org.
> Setting up harvesting: [UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
> Feeding entropy: .
> Starting Network: lo0 re0.
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_pIPV6>
>         inet6 ::id 298 (sh), uid 0: exited on signal 11 prefixlen 128 1 (core dumped)
> 
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2 
>         inet 127.0.0.1 netmask 0xff000000 
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         groups: lo 
> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTpICAST> metric 0 id 305 (sh), uid 0: exited on signal 11 (core dumped)
> mtu 1500
>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>         ether 98:90:96:d6:c9:6d
>         inet 172.16.8.10 netmask 0xffffff00 pid 310 (sh), uid 0: exited on signal 11 (core dumped)
> broadcast 172.16.8.255 
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (none)
>         status: no re0: link state changed to UP
> carrier
> Segmentation fault (core dumped)
> Startpid 314 (sh), uid 0: exited on signal 11 (core dumped)
> ing devd.
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> Segmentation fault (core dumped)
> pid 319 (sh), uid 0: exited on signal 11 (core dumped)
> Segmentation fault (core dumped)
> pid 330 (sh), uid 0: exited on signal 11 (core dumped)
> Segmentation fault (core dumped)ubt0 on uhub2
> ubt0: <Broadcom Corp BCM43142A0, rev 2.00/1.12, addr 3> on usbus0
> 
> random: harvesting attach, 8 bytes (4 bits) from ubt0
> pid 339 (sh), uid 0: exited on signal 11 (core dumped)
> Segmentation fault (core dumped)
> pid 343 (sh), uid 0: exited on signal 11 (core dumped)
> Segmentation fault (core dumped)WARNING: attempt to domain_add(bluetooth) after domainfinalize()
> 
> WARNING: attempt to domain_add(netgraph) after domainfinalize()
> add host 127.0.0.1: gateway lo0 fib 0: route already in table
> add net default: gateway 172.16.8.1
> add host ::1: gateway lo0 fib 0: route already in table
> add net fe80::: gateway ::1
> add net ff02::: gateway ::1
> add net ::ffff:0.0.0.0: gateway ::1
> add net ::0.0.0.0: gateway ::1
> Creating and/or trimming log files.
> Starting syslogd.
> Starting rpcbind.
> NFS access cache time=60
> No core dumps found.
> Setting NIS domain: lmdhw.com.
> Starting ypbind.
> Clearing /tmp (X related).
> Starting mountd.
> NFSv4 is disabled
> Starting nfsd.
> Starting statd.
> Starting lockd.
> Recovering vi editor sessions:.
> Starting lpd.
> Upda
> FreeBSD/amd64 (freebeast.catwhisker.org) (ttyu0)
> 
> login: 
> [end of console output -- dhw]
> 
> 
> So ... looks as if we still have at least one issue, and we have a way
> to evade the segfaults.
> 
> Bisection time?  Or if there's another approach (or even a suggestion
> for a revision to try first), I'm up for it.  (And yes, I'll just
> be rebuilding the kernel for the rest of this exercise, I think.
> That should speed things up significantly.)

No need.  It is clearly something with r322762 (more likely) or
r322763 (less likely).

Give me some time; I will either fix it today or revert the commits.
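
Had bisection been needed, the search proposed above over r322740..r322776
could be sketched roughly as follows. This is only an illustration in
Python and is not taken from the thread: first_bad_revision and the is_bad
callback are hypothetical names. In practice is_bad would mean building and
booting the kernel at that revision and checking whether /bin/sh segfaults.
The sketch also assumes that every revision after the first bad one is bad
as well.

```python
from typing import Callable


def first_bad_revision(good: int, bad: int,
                       is_bad: Callable[[int], bool]) -> int:
    """Return the lowest revision in (good, bad] for which is_bad() is true.

    Assumes `good` is known good, `bad` is known bad, and that the
    breakage persists in every revision after the one that introduced it.
    """
    lo, hi = good, bad  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid  # breakage is at mid or earlier
        else:
            lo = mid  # breakage is after mid
    return hi


# Stand-in predicate for illustration only: pretend r322762 introduced the
# problem (the thread only names it as the more likely suspect).
culprit = first_bad_revision(322740, 322776, lambda rev: rev >= 322762)
print(f"first bad revision: r{culprit}")
```

With a window of 36 revisions this needs at most six build-and-boot cycles,
which is why rebuilding only the kernel at each step keeps it tolerable.
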
Received on Tue Aug 22 2017 - 13:34:54 UTC
