On Sep 2, 2008, at 2:44 PM, Kostik Belousov wrote:

> On Tue, Sep 02, 2008 at 02:41:08PM -0400, Adam Jacob Muller wrote:
>>
>> On Sep 1, 2008, at 10:53 AM, Kostik Belousov wrote:
>>
>>> On Mon, Sep 01, 2008 at 05:33:37PM +0300, Vyacheslav Bocharov wrote:
>>>> I have similar problem in 7-STABLE (from 1 sep):
>>>> 32bit application exec 64application and we have an core dump:
>>>>
>>>> # gdb fw.sh fw.sh.core
>>>> GNU gdb 6.1.1 [FreeBSD]
>>>> Copyright 2004 Free Software Foundation, Inc.
>>>> GDB is free software, covered by the GNU General Public License,
>>>> and you are
>>>> welcome to change it and/or distribute copies of it under certain
>>>> conditions.
>>>> Type "show copying" to see the conditions.
>>>> There is absolutely no warranty for GDB. Type "show warranty" for
>>>> details.
>>>> This GDB was configured as "amd64-marcel-freebsd"...
>>>> Core was generated by `fw.sh'.
>>>> Program terminated with signal 11, Segmentation fault.
>>>> Reading symbols from /usr/lib/libstdc++.so.6...done.
>>>> Loaded symbols for /usr/lib/libstdc++.so.6
>>>> Reading symbols from /lib/libm.so.5...done.
>>>> Loaded symbols for /lib/libm.so.5
>>>> Reading symbols from /lib/libgcc_s.so.1...done.
>>>> Loaded symbols for /lib/libgcc_s.so.1
>>>> Reading symbols from /lib/libc.so.7...done.
>>>> Loaded symbols for /lib/libc.so.7
>>>> Reading symbols from /libexec/ld-elf.so.1...done.
>>>> Loaded symbols for /libexec/ld-elf.so.1
>>>> #0  0x0000000800507483 in __tls_get_addr () from /libexec/ld-elf.so.1
>>>> (gdb) bt
>>>> #0  0x0000000800507483 in __tls_get_addr () from /libexec/ld-elf.so.1
>>>> #1  0x0000000800ad8892 in _pthread_mutex_init_calloc_cb () from /lib/libc.so.7
>>>> #2  0x0000000800ada35f in malloc () from /lib/libc.so.7
>>>> #3  0x00000008007050ad in operator new () from /usr/lib/libstdc++.so.6
>>>> #4  0x00000008006b5f21 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.6
>>>> #5  0x00000008006b6ca5 in std::string::_S_copy_chars () from /usr/lib/libstdc++.so.6
>>>> #6  0x00000008006b6dc2 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string () from /usr/lib/libstdc++.so.6
>>>> #7  0x00000000004021ec in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at CCmdLine.cpp:16
>>>> #8  0x00000000004026c3 in global constructors keyed to cmdlist () at CCmdLine.cpp:177
>>>> #9  0x00000000004033a2 in __do_global_ctors_aux ()
>>>> #10 0x000000000040113e in _init ()
>>>> #11 0x0000000800b2b0c0 in __cxa_atexit () from /lib/libc.so.7
>>>> #12 0x00000000004014e8 in _start ()
>>>> #13 0x000000080052c000 in ?? ()
>>>>
>>>> I tried your patch but nothing changed.
>>> Exactly which patch ? There were three, one of which caused immediate
>>> panic. I put the patches at
>>> http://people.freebsd.org/~kib/misc/fsbase.1.patch
>>> http://people.freebsd.org/~kib/misc/fsbase.2.patch
>>>
>>> Could you, please, try both and report the results ?
>>> And, isolated test case, as several C files or recipe to reproduce
>>> this with base system, would be ideal.
>>>
>>>>
>>>> 2008/8/31 Kostik Belousov <kostikbel_at_gmail.com>
>>>>
>>>>> On Sun, Aug 31, 2008 at 10:16:18AM +0300, Kostik Belousov wrote:
>>>>>> On Sat, Aug 30, 2008 at 02:03:00PM -0700, Artem Belevich wrote:
>>>>>>> With the new patch kernel has crashed as soon as I ran i386 app,
>>>>>>> though the crash happened within in-kernel thread g_up:
>>>>>>>
>>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>>> cpuid = 2; apic id = 02
>>>>>>> fault virtual address = 0x20
>>>>>>> fault code = supervisor read data, page not present
>>>>>>> instruction pointer = 0x8:0xffffffff804a821f
>>>>>>> stack pointer = 0x10:0xffffffffac280b60
>>>>>>> frame pointer = 0x10:0x0
>>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>>>              = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>>> processor eflags = resume, IOPL = 0
>>>>>>> current process = 3 (g_up)
>>>>>>> trap number = 12
>>>>>>> panic: page fault
>>>>>>> cpuid = 2
>>>>>>> Uptime: 37s
>>>>>>> Physical memory: 8169 MB
>>>>>>> Dumping 380 MB: 365 349 333 317 301 285 269 253 237 221 205 189 173
>>>>>>> 157 141 125 109 93 77 61 45 29 13
>>>>>> Could you, please, show me the disassembled code around the faulted
>>>>>> %rip ?
>>>>>
>>>>> No need, it seems I found the problem. I trashed the %rdx that contains
>>>>> the third cpu_switch argument. Please, try the updated patch.
>>>>>
>>>>> Thanks for the testing !
>>>>>
>>>>> diff --git a/sys/amd64/amd64/cpu_switch.S b/sys/amd64/amd64/cpu_switch.S
>>>>> index f34b0cc..03f0eca 100644
>>>>> --- a/sys/amd64/amd64/cpu_switch.S
>>>>> +++ b/sys/amd64/amd64/cpu_switch.S
>>>>> @@ -249,6 +249,12 @@ store_seg:
>>>>>  1:      movl    %ds,PCB_DS(%r8)
>>>>>          movl    %es,PCB_ES(%r8)
>>>>>          movl    %fs,PCB_FS(%r8)
>>>>> +        movq    %rdx,%r11
>>>>> +        movl    $MSR_FSBASE,%ecx
>>>>> +        rdmsr
>>>>> +        shlq    $32,%rdx
>>>>> +        leaq    (%rax,%rdx),%r9
>>>>> +        movq    %r11,%rdx
>>>>>          jmp     done_store_seg
>>>>>  2:      movq    PCB_GS32P(%r8),%rax
>>>>>          movq    (%rax),%rax
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Vyacheslav Bocharov
>>
>>
>>
>> Hi,
>> i have this same issue on recent RELENG_7 (pre and post 7.1-PRERELEASE),
>> the issue was reproducible by a simple c-app compiled on 7.x 32-bit
>>
>> #include <unistd.h>
>> main()
>> {
>>         execl("/bin/ls", "/bin/ls", (char *) 0);
>> }
>>
>> this app will segfault rather reliably (but not 100% of the time)
>> (while true;do ./test; if [ "$?" -gt "0" ];then break; fi; done).
>>
>> patch 1 (http://people.freebsd.org/~kib/misc/fsbase.1.patch) fixes the
>> issue for me
>> patch 2 (http://people.freebsd.org/~kib/misc/fsbase.2.patch) does not
>> though it may mitigate it slightly (cause things to crash less
>> frequently)
>
> Patch below was committed to current, it shall address your issue.
>
> diff --git a/sys/amd64/amd64/cpu_switch.S b/sys/amd64/amd64/cpu_switch.S
> index f34b0cc..a0b11f8 100644
> --- a/sys/amd64/amd64/cpu_switch.S
> +++ b/sys/amd64/amd64/cpu_switch.S
> @@ -109,8 +109,24 @@ ENTRY(cpu_switch)
>          movq    %rsp,PCB_RSP(%r8)
>          movq    %rbx,PCB_RBX(%r8)
>          movq    %rax,PCB_RIP(%r8)
> -        movq    PCB_FSBASE(%r8),%r9
> -        movq    PCB_GSBASE(%r8),%r10
> +
> +        /*
> +         * Reread fs and gs bases. Explicit fs segment register load
> +         * by the usermode code may change actual fs base without
> +         * updating pcb_{fs,gs}base.
> +         *
> +         * %rdx still contains the mtx, save %rdx around rdmsr.
> +         */
> +        movq    %rdx,%r11
> +        movl    $MSR_FSBASE,%ecx
> +        rdmsr
> +        shlq    $32,%rdx
> +        leaq    (%rax,%rdx),%r9
> +        movl    $MSR_KGSBASE,%ecx
> +        rdmsr
> +        shlq    $32,%rdx
> +        leaq    (%rax,%rdx),%r10
> +        movq    %r11,%rdx
>
>          testl   $PCB_32BIT,PCB_FLAGS(%r8)
>          jnz     store_seg
> diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
> index 06c0803..f3c41f7 100644
> --- a/sys/amd64/amd64/machdep.c
> +++ b/sys/amd64/amd64/machdep.c
> @@ -734,6 +734,7 @@ exec_setregs(td, entry, stack, ps_strings)
>          pcb->pcb_fsbase = 0;
>          pcb->pcb_gsbase = 0;
>          critical_exit();
> +        pcb->pcb_flags &= ~(PCB_32BIT | PCB_GS32BIT);
>          load_ds(_udatasel);
>          load_es(_udatasel);
>          load_fs(_udatasel);
> diff --git a/sys/amd64/ia32/ia32_signal.c b/sys/amd64/ia32/ia32_signal.c
> index 9e98656..162dcf9 100644
> --- a/sys/amd64/ia32/ia32_signal.c
> +++ b/sys/amd64/ia32/ia32_signal.c
> @@ -742,5 +742,6 @@ ia32_setregs(td, entry, stack, ps_strings)
>
>          /* Return via doreti so that we can change to a different %cs */
>          pcb->pcb_flags |= PCB_FULLCTX | PCB_32BIT;
> +        pcb->pcb_flags &= ~PCB_GS32BIT;
>          td->td_retval[1] = 0;
>  }

This will be MFC'd into 7.1 before release?

-Adam

Received on Tue Sep 02 2008 - 17:39:38 UTC
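
A reading aid for the committed patch quoted above. rdmsr returns the 64-bit MSR value split across %edx (high half) and %eax (low half), and the patch recombines the halves with shlq/leaq. The two C hunks then adjust pcb_flags: exec_setregs clears PCB_32BIT and PCB_GS32BIT, while ia32_setregs sets PCB_FULLCTX | PCB_32BIT and clears PCB_GS32BIT. The sketch below restates both steps in plain userland C. The names msr_combine, sketch_exec_flags, sketch_ia32_flags and sketch_selfcheck, and the SK_* bit values, are hypothetical and chosen only for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative bit values only. The real PCB_* constants are defined in
 * the amd64 pcb header and may differ from these.
 */
#define SK_PCB_FULLCTX  0x01u
#define SK_PCB_GS32BIT  0x20u
#define SK_PCB_32BIT    0x40u

/*
 * Hypothetical helper: rebuild a 64-bit MSR value from the halves that
 * rdmsr leaves in %eax (low) and %edx (high). In 64-bit mode rdmsr
 * zeroes the upper 32 bits of both registers, so the addition done by
 * "shlq $32,%rdx; leaq (%rax,%rdx),%r9" equals this bitwise OR.
 */
uint64_t
msr_combine(uint32_t eax, uint32_t edx)
{
	return (((uint64_t)edx << 32) | (uint64_t)eax);
}

/* Flag update mirroring the exec_setregs hunk (64-bit image). */
unsigned int
sketch_exec_flags(unsigned int flags)
{
	flags &= ~(SK_PCB_32BIT | SK_PCB_GS32BIT);
	return (flags);
}

/* Flag update mirroring the ia32_setregs hunk (32-bit image). */
unsigned int
sketch_ia32_flags(unsigned int flags)
{
	flags |= SK_PCB_FULLCTX | SK_PCB_32BIT;
	flags &= ~SK_PCB_GS32BIT;
	return (flags);
}

/* Small self-check that exercises the three helpers. */
void
sketch_selfcheck(void)
{
	/* Stale 32-bit flags must not survive an exec of a 64-bit image. */
	assert(sketch_exec_flags(SK_PCB_32BIT | SK_PCB_GS32BIT) == 0);
	/* A 32-bit image gets FULLCTX and 32BIT set, GS32BIT cleared. */
	assert(sketch_ia32_flags(SK_PCB_GS32BIT) ==
	    (SK_PCB_FULLCTX | SK_PCB_32BIT));
	assert(msr_combine(0x1u, 0x1u) == 0x100000001ULL);
}
```

Calling sketch_selfcheck() from a small main() is enough to exercise it. The flag helpers model the case reported in this thread, where a 32-bit process execs a 64-bit binary: without the clearing step, the leftover PCB_32BIT bit would still send cpu_switch down the store_seg path for what is now a 64-bit process.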