On Sun, Nov 11, 2018 at 08:44:24PM +0100, Guido Falsi wrote:
> On 11/11/18 11:10, Guido Falsi wrote:
> > On 11/11/18 00:07, Konstantin Belousov wrote:
> >> On Sat, Nov 10, 2018 at 05:27:09PM +0100, Guido Falsi wrote:
> >>> On 10/11/18 13:08, Guido Falsi wrote:
> >>>> I'll try to bisect things, but it will be a slow process.
> >>>
> >>> I narrowed it down to r339895.
> >> I somehow doubt that this is the case.
> >>
> >
> > I did not mean to accuse you. Instead thanks for this reply and the
> > suggestions. Really appreciated.
> >
> > I simply found out that removing that commit from my sources gives me a
> > stable system and reported such finding.
> >
> > I understand that the actual cause could be an interaction with other
> > code and am ready to review my findings.
> >
> >> If you take post-r339895 kernel and start e.g. 11.2-RELEASE userspace
> >> (untar the installation into jail to avoid reinstallation), does it
> >> still demonstrate the behaviour ?
> >>
> >> Also try to run pre-r339895 with the 12.0 userspace from e.g. 12.0-BETA4
> >> builds.
> >
> > I'll perform such tests. Please allow me some time to report back what I
> > get.
>
> I performed these tests. I downloaded the 12.0-BETA4 and 11.2
> installation images and replaced the kernels in there. This was faster
> than working with jails on a crippled system.
>
> r339895 kernel on 11.2-RELEASE causes fsck (launched by rc) to dump core
> and this stops the boot procedure.
>
> r339894 kernel on 12.0-BETA4 works fine.

Ok, let's try to find some reason.

- When you build your kernels, you do not use any cpu-specific optimization
  flags, do you ?  Moreover, you follow the standard build procedure and
  your make.conf and src.conf are empty, right ?
- Do you preload a microcode update from the loader ?
- Show the output of sysctl vm.pmap.
- Show verbose dmesg from the boot of the problematic kernel.  You posted
  non-verbose dmesg for 12.0-BETA4.
- Enter ddb after booting the problematic kernel.  Do
  db> x/x cpu_stdext_feature
  db> x/x cpu_stdext_feature+4
- From the same ddb session, disassemble e.g. cpu_set_user_tls().
  You could paste the whole disassembly, but really I only want the
  single line with the call to set_pcb_flagsXXXX; it should be either
  set_pcb_flags_raw or set_pcb_flags_fsgsbase.  To disassemble in ddb, do
  db> x/i cpu_set_user_tls
  and then press <enter> repeatedly to get the following instructions.
  (I want the disassembly from ddb and not from gdb/kgdb).
- Try the following patch.

diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
index 6e36ae97523..8dafd4b4756 100644
--- a/sys/amd64/amd64/machdep.c
+++ b/sys/amd64/amd64/machdep.c
@@ -2627,8 +2627,8 @@ set_pcb_flags_raw(struct pcb *pcb, const u_int flags)
  * the PCB_FULL_IRET flag is set.  We disable interrupts to sync with
  * context switches.
  */
-static void
-set_pcb_flags_fsgsbase(struct pcb *pcb, const u_int flags)
+void
+set_pcb_flags(struct pcb *pcb, const u_int flags)
 {
 	register_t r;
 
@@ -2649,13 +2649,6 @@ set_pcb_flags_fsgsbase(struct pcb *pcb, const u_int flags)
 	}
 }
 
-DEFINE_IFUNC(, void, set_pcb_flags, (struct pcb *, const u_int), static)
-{
-
-	return ((cpu_stdext_feature & CPUID_STDEXT_FSGSBASE) != 0 ?
-	    set_pcb_flags_fsgsbase : set_pcb_flags_raw);
-}
-
 void
 clear_pcb_flags(struct pcb *pcb, const u_int flags)
 {

Received on Sun Nov 11 2018 - 20:14:50 UTC