Re: r343567 aka PAE vs non-PAE merge breaks i386 freebsd

From: Rodney W. Grimes <freebsd-rwg_at_pdx.rh.CN85.dnsmgr.net>
Date: Sat, 23 Feb 2019 10:04:07 -0800 (PST)
> On Sat, Feb 23, 2019 at 11:19:31AM +0200, Konstantin Belousov wrote:
> > On Fri, Feb 22, 2019 at 07:26:44PM -0800, Steve Kargl wrote:
> > > On Thu, Feb 21, 2019 at 10:04:10PM -0800, Steve Kargl wrote:
> > > > On Thu, Feb 21, 2019 at 07:39:25PM -0800, Steve Kargl wrote:
> > > > > r343567 merges the PAE vs non-PAE pmap headers for i386
> > > > > freebsd.  After bisection and dealing with the drm-legacy-kmod
> > > > > fallout, I bisected /usr/src to r343567.  Building world and
> > > > > a GENERIC kernel and the minimum set of ports to start Xorg
> > > > > on my Dell Latitude D530 laptop, results in a black screen
> > > > > of death and a locked up laptop (no keyboard, mouse, or video).
> > > > > 
> > > > > A comparison of /var/log/Xorg.0.log for r343566 (Xorg loads
> > > > > and functions) and r343567 (Xorg black screen of death) shows
> > > > > that /boot/modules/i915kms.ko loads correctly as the log
> > > > > files are identical.
> > > > > 
> > > > > Comparing dmesg for r343566 to r343567 shows the following
> > > > >  
> > > > > --- dmesg.343566	2019-02-20 08:13:07.727202000 -0800
> > > > > +++ dmesg.343567	2019-02-21 19:02:24.469562000 -0800
> > > > > @@ -3,11 +3,11 @@
> > > > >  Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > > > >  	The Regents of the University of California. All rights reserved.
> > > > >  FreeBSD is a registered trademark of The FreeBSD Foundation.
> > > > > -FreeBSD 13.0-CURRENT r343566 GENERIC i386
> > > > > +FreeBSD 13.0-CURRENT r343567 GENERIC i386
> > > > >  FreeBSD clang version 7.0.1 (tags/RELEASE_701/final 349250) (based on LLVM 7.0.1)
> > > > >  WARNING: WITNESS option enabled, expect reduced performance.
> > > > >  VT(vga): resolution 640x480
> > > > > -CPU: Intel(R) Core(TM)2 Duo CPU     T7250  @ 2.00GHz (1995.05-MHz 686-class CPU)
> > > > > +CPU: Intel(R) Core(TM)2 Duo CPU     T7250  @ 2.00GHz (1995.04-MHz 686-class CPU)
> > > > >    Origin="GenuineIntel"  Id=0x6fd  Family=0x6  Model=0xf  Stepping=13
> > > > >    Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > > > >    Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
> > > > > @@ -16,7 +16,7 @@
> > > > >    VT-x: (disabled in BIOS) HLT,PAUSE
> > > > >    TSC: P-state invariant, performance statistics
> > > > >  real memory  = 4294967296 (4096 MB)
> > > > > -avail memory = 3639914496 (3471 MB)
> > > > > +avail memory = 4154175488 (3961 MB)
> > > > > 
> > > > > Somehow the r343567 kernel found an additional 490 MB of memory,
> > > > > which leads me to believe that after loading i915kms.ko there
> > > > > are some serious memory stomping issues.
> > > > > 
> > > > > I am willing to do whatever is necessary to fix this issue (short
> > > > > of mailing the laptop to someone).  Is it possible to revert
> > > > > r343567 and move forward? 
> > > > > 
> > > > 
> > > > More info from sysctl.  With the "good" r343566, I see
> > > > 
> > > > vm.kmem_map_free: 1187033088
> > > > vm.kmem_map_size: 27234304
> > > > vm.kmem_size_scale: 3
> > > > vm.kmem_size_max: 1715470336
> > > > vm.kmem_size_min: 12582912
> > > > vm.kmem_zmax: 65536
> > > > vm.kmem_size: 1214267392
> > > > hw.physmem: 3714269184
> > > > hw.usermem: 3650867200
> > > > hw.realmem: 4294963200
> > > > 
> > > > With the problematic r343567, I see
> > > > 
> > > > vm.kmem_map_free: 1683152896
> > > > vm.kmem_map_size: 28123136
> > > > vm.kmem_size_scale: 1
> > > > vm.kmem_size_max: 1711276032
> > > > vm.kmem_size_min: 12582912
> > > > vm.kmem_zmax: 65536
> > > > vm.kmem_size: 1711276032
> > > > hw.physmem: 4252360704
> > > > hw.usermem: 4146999296
> > > > hw.realmem: 4294963200
> > > > 
> > > > Ideas?
> > > > 
> > > 
> > > Here's the 'diff -uw' between a verbose dmesg boot of r343566
> > > and dmesg boot of r343567.  The memory size looks rather puzzling.
> > > Can the people responsible for the i386 pmap.h merging take a
> > > look?
> > What is puzzling ?
> 
> Supposedly, the laptop only has 4 GB of memory.  Not sure how
> it finds memory above 4 GB.

It probably has what is called UMA (unified memory
architecture): the graphics framebuffer is mapped into
the address space below 4G, and the RAM it displaces is
remapped above 4G, giving you this little bit of >4G
memory that is now triggering PAE.
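
As a rough cross-check of the numbers quoted earlier in the thread,
here is a minimal Python sketch.  The values are copied from Steve's
dmesg and sysctl output; the variable names and the to_mib helper are
mine, purely for illustration:

```python
# Values copied from the dmesg and sysctl output quoted in this thread.
MIB = 1024 * 1024
FOUR_GIB = 4 * 1024**3

avail = {"r343566": 3639914496, "r343567": 4154175488}    # dmesg "avail memory"
physmem = {"r343566": 3714269184, "r343567": 4252360704}  # sysctl hw.physmem
real_memory = 4294967296                                   # dmesg "real memory"

def to_mib(nbytes):
    """Convert a byte count to whole MiB, truncating the remainder."""
    return nbytes // MIB

extra_avail = avail["r343567"] - avail["r343566"]
extra_physmem = physmem["r343567"] - physmem["r343566"]

print("avail memory gained:", to_mib(extra_avail), "MiB")   # 490
print("hw.physmem gained:  ", to_mib(extra_physmem), "MiB")  # 513
print("installed RAM equals 4 GiB:", real_memory == FOUR_GIB)
```

The gain of roughly 490 MiB matches what Steve reported.  That it
corresponds to RAM remapped above 4G is my inference from the
numbers, not something these logs prove.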

This may not be desired.  Is there any performance
advantage to not turning on PAE in this situation?

> I build 343566 and minimum ports needed for Xorg including
> drm-legacy-kmod.  I can load xorg, and in fact, I am typing
> this email now on the laptop with vi in xterm.
> 
> I build 343567 and minimum ports needed for Xorg including
> drm-legacy-kmod.  I try to start Xorg.  Black screen of death.
> No mouse.  No keyboard.  Just a hard reset.

That would be a regression caused by PAE coming into play.

> I build 343567 and minimum ports needed for Xorg including
> drm-legacy-kmod.  I load i915kms.ko, do not start Xorg.  There
> are surprising streaks/blotches of color on screen.  Building any
> port with the system's cc results in occasional segfaults.
> 
> > When the kernel boots in PAE mode, it can (and will) make use of physical
> > memory mapped above 4G.  I highlighted the SMAP entry which represents
> > such memory, below.
> > 
> > kmem_scale was changed in the PAE commit, see the commit message for
> > explanation.
> > 
> 
> I read it multiple times.  It does not explain how to get the
> old pre-343567 behavior where the laptop is usable.  It mentions
> two new sysctl entities.  One is irrelevant as I don't have 24+ GB
> of memory.  The other has this in the commit message:
> 
>     There are two tunables added: hw.above4g_allow and ...,

I think trying to set that tunable (hw.above4g_allow) to 0 on a
post-r343567 system is worth a try.
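
For example, assuming hw.above4g_allow is read as a loader tunable
at boot (which is how I read the commit message; I have not verified
it on the affected laptop), a line like the one below in
/boot/loader.conf should keep the kernel from using memory above 4G:

```
# /boot/loader.conf
# Assumption: 0 disables use of physical memory above 4G on i386.
hw.above4g_allow="0"
```

The same value could be tried for a single boot from the loader
prompt with "set hw.above4g_allow=0", which avoids editing the file.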

>     the first one is kept enabled for now to evaluate the status
>     on HEAD, ...
> 
> Well, here's a report that indicates the status is "not okay".  The
> commit message also has the afterthought:
> 
>     Also, VM_KMEM_SIZE_SCALE changed from 3 to 1.
> 
> Okay, so what does that mean?  Will setting vm.kmem_size_scale to 3
> fix what appears to be some memory corruption or mismanagement?
> 
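
On the VM_KMEM_SIZE_SCALE question, here is a simplified Python model
of how the scale could feed into vm.kmem_size.  This is an assumption
on my part, not the kernel's actual code: it divides the managed
memory by the scale and clamps the result to vm.kmem_size_min and
vm.kmem_size_max, using the dmesg "avail memory" figure as a stand-in
for the kernel's page count.

```python
PAGE_SIZE = 4096  # i386 page size in bytes

def model_kmem_size(mem_bytes, scale, size_min, size_max):
    """Hypothetical model: (pages / scale) * PAGE_SIZE, clamped to [min, max]."""
    pages = mem_bytes // PAGE_SIZE
    size = (pages // scale) * PAGE_SIZE
    return max(size_min, min(size, size_max))

# Inputs copied from the dmesg and sysctl output quoted in this thread.
good = model_kmem_size(3639914496, 3, 12582912, 1715470336)  # r343566, scale 3
bad = model_kmem_size(4154175488, 1, 12582912, 1711276032)   # r343567, scale 1

print("r343566 model:", good, "reported:", 1214267392)
print("r343567 model:", bad, "reported:", 1711276032)
```

With scale 1 the model hits the vm.kmem_size_max ceiling, which
matches the reported vm.kmem_size exactly; with scale 3 it lands
within about 0.1% of the reported value.  Under this model the scale
only changes how large the kernel map is allowed to grow.  It does
not tell us whether restoring 3 would cure the corruption.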

-- 
Rod Grimes                                                 rgrimes_at_freebsd.org
Received on Sat Feb 23 2019 - 17:04:11 UTC
