> On 30 Mar 2018, at 18:03, Stefan Esser <se_at_freebsd.org> wrote:
>
> On 29.03.18 at 07:15, Toomas Soome wrote:
>>
>>
>>> On 29 Mar 2018, at 01:06, Stefan Esser <se_at_freebsd.org> wrote:
>>>
>>> On 28.03.18 at 22:28, Warner Losh wrote:
>>>>> Hmmm, the code references point into the boot loader code - I had
>>>>> expected that there is a problem in the kernel, not the boot loader.
>>>>>
>>>>>> [1]
>>>>>> https://svnweb.freebsd.org/base/head/stand/libsa/sbrk.c?view=markup#l56
>>>>>
>>>>>
>>>>> Seems that setheap has either not been called or has been called with
>>>>> base=0.
>>>>
>>>> Right, which is odd...
>>>>
>>>>>> [2]
>>>>>> https://svnweb.freebsd.org/base/head/stand/i386/zfsboot/zfsboot.c?view=markup#l688
>>>>>
>>>>>
>>>>> I had thought that the zfs boot code has been initialized before the
>>>>> menu is displayed?
>>>>
>>>> Right, all of this should be done looooong before we get to the
>>>> interpreter. Can you break into the loader prompt and try the `heap`
>>>> command, see what that outputs? CC'ing imp_at_ because he actually knows
>>>> things.
>>>>
>>>> Totally weird. I'd add a printf to the setheap() function to display its args
>>>> and see if you get this panic before/after that printf...
>>>
>>> I'm currently using a Forth-enabled boot loader again, since this is a
>>> "production" machine (my home server, which also receives and keeps all
>>> my work email, for example).
>>>
>>> I'll build a clean world with the Lua loader and test it in the next
>>> few days. Tests will include the "heap" loader command and I'll add the
>>> printf (though, if sbrk() has really not been called, I guess that will
>>> not go too well ...).
>>>
>>> Is it possible that the setheap function is called a second time, just
>>> before jumping into the kernel? (In that case adding the printf might
>>> crash the loader in the first setheap call ...)
>>>
>>> Since the loader menu (and escaping from the menu) works, there must be
>>> a valid heap at that time.
>>>
>>
>> Indeed. And assuming the message really is from the loader, it means there
>> must be memory corruption - if so, you can check which variables are located
>> close to the heap-related ones… Also, since you have the working menu, it has
>> to be related to the actual loading. Since the loading itself has been working
>> so far, it should be related to the Lua-specific bits that prepare to call the
>> load functions.
>
> Ok, some more data points:
>
> 1) A printf in setheap reported plausible values during start-up of zfsboot.
>    The menu appeared and wiped away the values so fast that I could not take
>    a photo or write them down.
>

If you got the menu and stuff, it means that at that point the heap was all OK. Just after setheap(), bcache_init() is called, and that too will allocate memory. What you can do is escape from the menu to the OK prompt and check the output of the heap and biosmem commands…

> 2) I have rebuilt world and kernel based on r331763. Booting resulted in the
>    same panic as reported before. There was no debug output from the patched
>    setheap call before the panic (which indicates that it was not called a
>    second time).
>
> 3) In order to get my system to boot, I interrupted loading of zfsloader and
>    forced loading of the previous version (from a world build with Forth in
>    the loader). Booting succeeded with the latest kernel ...
>
> It looks as if sbrk() was called in zfsloader before setheap() has been used
> to initialize the heap parameters, if Lua is enabled instead of Forth. See
> stand/i386/loader/main.c:124 for the location of the setheap call in the
> loader.

This can only happen when something is called before main…

>
> This is obviously hard to debug, though, since printf cannot be called at that
> point. A pure write(2) should be possible without heap, but since the console
> has not been initialized at the point of the setheap invocation, there is no
> working output device, AFAIK.
>
> I do not see how any sbrk() call could occur before setheap is called. And
> there does not appear to be any other setheap function (or macro) in the
> tree that could overload the one defined in stand/libsa/sbrk.c ...
>
> I have no idea how to proceed from here ...
>
> But now I'm sure it is a problem in zfsloader (or loader in general?).
>
> Hmmm: How is the panic message printed by sbrk() without an initialized heap?
> The definition of panic in stand/libsa/panic.c relies on a working printf!
>
> I should be able to use printf in the same way as panic does, but I did
> not succeed when I tried to use it early in zfsloader ...
>
> Regards, STefan

rgds,
toomas

Received on Fri Mar 30 2018 - 16:10:51 UTC
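
The failure mode discussed in this thread can be illustrated with a minimal sketch, assuming a simple bump allocator: an sbrk-style call has nothing to hand out while the heap base is still unset, which is the case if setheap has not run yet or was called with base=0. The `demo_` names and the `uninitialized` out-parameter are hypothetical and only serve the illustration; this is not the code from stand/libsa/sbrk.c, and where the real loader would panic, the sketch just returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a loader heap; not the libsa source. */
static char  *demo_heapbase;   /* stays NULL until demo_setheap() runs */
static size_t demo_maxheap;    /* total bytes available in the heap */
static size_t demo_heapsize;   /* bytes handed out so far */

/* Record the heap bounds; must run before the first demo_sbrk() call. */
void demo_setheap(void *base, void *top)
{
    assert((char *)top >= (char *)base);
    demo_heapbase = base;
    demo_maxheap = (size_t)((char *)top - (char *)base);
    demo_heapsize = 0;
}

/*
 * Hand out incr zeroed bytes from the heap.
 * Returns NULL on failure; *uninitialized is set to 1 when the failure
 * is caused by a missing (or base=0) demo_setheap() call.
 */
void *demo_sbrk(size_t incr, int *uninitialized)
{
    *uninitialized = (demo_heapbase == NULL);
    if (demo_heapbase == NULL)
        return NULL;            /* the real loader would panic here */
    if (incr > demo_maxheap - demo_heapsize)
        return NULL;            /* heap exhausted */

    void *ret = demo_heapbase + demo_heapsize;
    memset(ret, 0, incr);
    demo_heapsize += incr;
    return ret;
}
```

In this model, any allocation made before `demo_setheap()` fails with `*uninitialized` set, whereas a heap that was set up correctly and later looks unset would point to the base variable being overwritten afterwards.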