Hi Andriy, On 08/31/15 18:46, Andriy Gapon wrote: > On 30/08/2015 22:09, Andriy Gapon wrote: >> On 30/08/2015 19:16, Konstantin Belousov wrote: >>> This is strange, I do not think that could be a right explanation of this >>> issue. The taskqueue callout is initialized with the mutex, which means >>> that the callout_stop() caller >>> - must own the mutex; >>> - is synchronous with the callout. >>> In other words, callout cannot be running when taskqueue_cancel_timeout() >>> calls callout_stop(), it only can be dequeued but the callout function >>> is not yet called. If callout_stop() is performed meantime, between >>> dropping callout_cpu lock, and locking the mutex, it must be not run. >> >> Thank you for the explanation. I am not familiar with the code and I >> misinterpreted the manual page and thought that callout_stop() might be unable >> to stop the callout even if it was initialized with the mutex. I see my mistake >> now. > > I had to look at the code, of course. > Could the following logic actually be buggy? > > int > callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, > void (*ftn)(void *), void *arg, int cpu, int flags) > { > ... > if (cc_exec_curr(cc, direct) == c) { > /* > * We're being asked to reschedule a callout which is > * currently in progress. If there is a lock then we > * can cancel the callout if it has not really started. > */ > if (c->c_lock != NULL && cc_exec_cancel(cc, direct)) > cancelled = cc_exec_cancel(cc, direct) = true; > > I have a suspicion that the condition should check for !cc_exec_cancel: > - it seems that cc_exec_cancel is set to true when the callback execution starts > - logically, it wouldn't make sense to check if X is true and then set it to true > > The code is quite hard to understand in a short time, so I could be > misinterpreting what's going on, but I'd like to check my suspicion. While touching this topic, is the following code more readable or not (see the callout_restart_async function): https://svnweb.freebsd.org/base/projects/hps_head/sys/kern/kern_timeout.c?revision=287248&view=markup > > If this is a bug indeed, then it might make the following scenario possible: > > - softclock() is being executed in thread T1 and the control flow is in > softclock_call_cc() after CC_UNLOCK() and before lc_lock(c_lock) > - thread T2 is executing callout_reset_sbt_on on the same callout as T1 > - so, I think, T2 will see cc_exec_cancel == false and thus, because of the > suspected bug, cc_exec_cancel will remain false > - T2 will proceed to callout_cc_add() and will happily (re-)schedule the callout > - after T2 releases the callout lock, T1 will continue in softclock_call_cc() > and because cc_exec_cancel is false the callout's callback function will get called Correct. > > In effect, there will be an extra call of the callout. > If that's so that could explain the triggered assertion in taskqueue_timeout_func(). > > P.S. I am going to add an assertion in callout_cc_add that CALLOUT_ACTIVE is not > set and I will see if I can get anything interesting from it. > > P.P.S. On a further look, it seems that there is a bug indeed and it's probably > just a typo that got introduced in r278469: > - if (c->c_lock != NULL && !cc->cc_exec_entity[direct].cc_cancel) > - cancelled = cc->cc_exec_entity[direct].cc_cancel = true; > - if (cc->cc_exec_entity[direct].cc_waiting) { > + if (c->c_lock != NULL && cc_exec_cancel(cc, direct)) > + cancelled = cc_exec_cancel(cc, direct) = true; > + if (cc_exec_waiting(cc, direct)) { > Looks like a bug to me trying to make the code more readable removed a not in there! --HPSReceived on Mon Aug 31 2015 - 17:08:47 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:59 UTC