Re: Build failed in Jenkins: FreeBSD_HEAD-tests2 #854

From: Konstantin Belousov <kostikbel_at_gmail.com> Date: Fri, 20 Mar 2015 14:21:26 +0200 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:56 UTC

On Fri, Mar 20, 2015 at 10:52:06AM +0100, Mateusz Guzik wrote:
> On Fri, Mar 20, 2015 at 11:34:51AM +0200, Konstantin Belousov wrote:
> > On Fri, Mar 20, 2015 at 03:20:26AM +0100, Mateusz Guzik wrote:
> > > On Fri, Mar 20, 2015 at 02:08:23AM +0000, jenkins-admin_at_freebsd.org wrote:
> > > > lib/libc/sys/setrlimit_test:setrlimit_nproc  ->  maxproc limit exceeded by uid 977 (pid 29170); see tuning(7) and login.conf(5)
> > > > passed  [0.551s]
> > > > lib/libc/sys/setrlimit_test:setrlimit_perm  ->  panic: mutex process lock not owned at /builds/FreeBSD_HEAD/sys/kern/kern_prot.c:1974
> > > > cpuid = 1
> > > > KDB: stack backtrace:
> > > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe009749a8e0
> > > > vpanic() at vpanic+0x189/frame 0xfffffe009749a960
> > > > panic() at panic+0x43/frame 0xfffffe009749a9c0
> > > > __mtx_assert() at __mtx_assert+0xc2/frame 0xfffffe009749a9d0
> > > > proc_set_cred() at proc_set_cred+0x36/frame 0xfffffe009749a9f0
> > > > fork1() at fork1+0x27e/frame 0xfffffe009749aac0
> > > > sys_fork() at sys_fork+0x1f/frame 0xfffffe009749aae0
> > > > amd64_syscall() at amd64_syscall+0x27f/frame 0xfffffe009749abf0
> > > > Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe009749abf0
> > > > --- syscall (0, FreeBSD ELF64, nosys), rip = 0x8019c216a, rsp = 0x7fffffffc2d8, rbp = 0x7fffffffc340 ---
> > > > KDB: enter: panic
> > > > [ thread pid 660 tid 100065 ]
> > > > Stopped at      kdb_enter+0x3e: movq    $0,kdb_why
> > > 
> > > Weird, I'll look at that.
> > 
> > This is due to p_ucred not initialized on allocation of struc proc.
> > The member is not in p_startzero/p_endzero region, so it contains
> > garbage at the stage of the fork where proc_set_cred() is called,
> > while the function makes assertion based on the p_ucred content.
> 
> Yes I figured, but proc_init left me quite puzzled:
> 
> static int
> proc_init(void *mem, int size, int flags)
> {
> 	struct proc *p;
> 
> 	p = (struct proc *)mem;
> 	SDT_PROBE(proc, kernel, init, entry, p, size, flags, 0, 0);
> 	p->p_sched = (struct p_sched *)&p[1];
> 	bzero(&p->p_mtx, sizeof(struct mtx));
> 	mtx_init(&p->p_mtx, "process lock", NULL, MTX_DEF | MTX_DUPOK);
> 
> We bzero only the first mutex to make sure mtx_init works?
> 
> 	mtx_init(&p->p_slock, "process slock", NULL, MTX_SPIN);
> 	mtx_init(&p->p_statmtx, "pstatl", NULL, MTX_SPIN);
> 	mtx_init(&p->p_itimmtx, "pitiml", NULL, MTX_SPIN);
> 	mtx_init(&p->p_profmtx, "pprofl", NULL, MTX_SPIN);
> 
> How about these?
> 
> So the trivial fix would be to move p_ucred or explicitely NULL it here.
> 
> All these mtx_init calls would use MTX_NEW flag and bzero could be
> eliminated.
It is self-contradicional to initialize p_ucred to pacify assertion,
while removing bzero to avoid doing extra work to silence too eager
assert.

> 
> I'll commit a quick fix shortly.
> 
> I'm really confused what's the purpose of checking for double
> initialisation of locks. I'm not aware of any actual bug caught by this,
> on the other hand I'm aware of quite a few cases where bzero or M_ZERO
> were used just to make sure mtx_init passes.
There were bugs catched by this.

> 
> So preferably I would just get rid of such a check and effectively make
> the behaviour as if MTX_NEW is always used.