On Mon, 3 May 2004, Burkard Meyendriesch wrote:

> On Mon, 3 May 2004 15:00:15 +1000 (EST) Bruce Evans wrote:
> > ...
> > You need to turn PUC_FASTINTR back off to test the patch.
>
> I disabled "options PUC_FASTINTR" in my kernel configuration file
> and made a new kernel.  The actual boot message is:
>
> May  3 09:18:59 Reineke kernel: puc0: <Moxa Technologies, Smartio C104H/PCI> port 0xa000-0xa00f,0xa400-0xa43f,0xa800-0xa87f irq 19 at device 14.0 on pci0
> May  3 09:18:59 Reineke kernel: puc0: Reserved 0x40 bytes for rid 0x18 type 4 at 0xa400
> May  3 09:18:59 Reineke kernel: sio4: <Moxa Technologies, Smartio C104H/PCI> on puc0
> May  3 09:18:59 Reineke kernel: sio4: type 16550A
> May  3 09:18:59 Reineke kernel: sio4: unable to activate interrupt in fast mode - using normal mode
>
> Doing my sio5 Palm backup test now I again get lots of silo overflows;
> the input rate is in a range from 100 to 10000 chars/s and the puc
> interrupt rate behaves similarly: from 12 to 1200 ints/s.  After
> several minutes the Palm backup software gives up due to protocol
> errors (maybe as a result of massive character losses).
>
> If your patch only becomes effective without "options PUC_FASTINTR" it
> does not seem to solve the sio interrupt problem in my environment.

So much for my theory that the problem is contention with a low-priority
thread.  Since holding a spin lock or otherwise disabling interrupts for
too long would also break the PUC_FASTINTR case, the problem must be that
the highest-priority runnable thread (which with my patch can only be the
sio (puc) ithread, if that thread is runnable) is not always run.  This
is quite likely to be just the old bug that the handling of interrupts
which can't be handled immediately may be delayed for too long.  From
ithread_schedule():

%       mtx_lock_spin(&sched_lock);
%       if (TD_AWAITING_INTR(td)) {
%               CTR2(KTR_INTR, "%s: setrunqueue %d", __func__, p->p_pid);
%               TD_CLR_IWAIT(td);
%               setrunqueue(td);
%               if (do_switch &&
%                   (ctd->td_critnest == 1) ) {
%                       KASSERT((TD_IS_RUNNING(ctd)),
%                           ("ithread_schedule: Bad state for curthread."));
%                       if (ctd->td_flags & TDF_IDLETD)
%                               ctd->td_state = TDS_CAN_RUN; /* XXXKSE */
%                       mi_switch(SW_INVOL);
%               } else {
%                       curthread->td_flags |= TDF_NEEDRESCHED;
%               }
%       } else {

When the switch can't be done immediately (which is now always (!!)),
this just sets TDF_NEEDRESCHED, but TDF_NEEDRESCHED is not checked until
return to user mode, so the switch is indefinitely delayed.
critical_enter() now disables interrupts, so the thread that got
interrupted cannot be in a critical section.  When we return to it
without checking TDF_NEEDRESCHED, it continues until it either gives up
control, or perhaps until it is interrupted by an interrupt whose
handler is missing the bugs here (a clock interrupt perhaps).

(!!) For hardware interrupts, ithread_schedule() is now always called
from a critical section.  Interrupts are now disabled in critical
sections, so ctd->td_critnest is always 0 when
ithread_execute_handlers() is called, always 1 when ithread_schedule()
is called, and always 2 when it is checked above.  So the switch can
only occur when ithread_schedule() is called in contexts that are
missing this foot-shooting.  There may be a few such calls, but software
interrupts are often scheduled from critical regions too.

I've fixed the primary bug in 2 different versions of interrupt handling
(mine and last October's version in -current).  Rescheduling must be
considered whenever td_critnest is decremented to 0.
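To make that concrete, here is a minimal sketch of the invariant -- my
illustration only, not code from either version.  It reuses names that
appear above (td_critnest, TDF_NEEDRESCHED, sched_lock, mi_switch(),
cold); the function itself is hypothetical, and the unpending and
recursion issues that the patch below wrestles with are omitted:

%%%
/*
 * Hypothetical sketch -- not the real critical_exit().  The point is
 * that when td_critnest is decremented to 0, a deferred TDF_NEEDRESCHED
 * must be acted on here, not left for ast() at return to user mode.
 */
static __inline void
critical_exit_sketch(struct thread *td)
{

        if (td->td_critnest != 1) {
                td->td_critnest--;
                return;
        }
        td->td_critnest = 0;
        if (td->td_flags & TDF_NEEDRESCHED && !cold) {
                mtx_lock_spin(&sched_lock);
                mi_switch(SW_INVOL);    /* clears TDF_NEEDRESCHED */
                mtx_unlock_spin(&sched_lock);
        }
}
%%%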
The following patch worked last October, but its infrastructure went
away last November:

%%%
--- critical.c~ Sat Aug 16 20:39:26 2003
+++ critical.c  Fri Oct 31 01:09:13 2003
@@ -37,4 +37,5 @@
 #include <sys/pcpu.h>
 #include <sys/proc.h>
+#include <sys/resourcevar.h>
 #include <sys/sysctl.h>
 #include <sys/ucontext.h>
@@ -56,4 +57,6 @@
 #endif
 
+#include <ddb/ddb.h>
+
 void    i386_unpend(void);      /* NOTE: not static, called from assembly */
 
@@ -67,8 +70,40 @@
        register_t eflags;
        struct thread *td;
+#ifdef DDB
+       static int traced;
+
+       if (db_active && !traced) {
+               traced = 1;
+               printf("cpu_unpend: naughty db\n");
+               backtrace();
+       }
+#endif
+
+#define ALLOW_PSL_I_VIOLATION
+
+#ifndef ALLOW_PSL_I_VIOLATION
+       /*
+        * CPU interrupts can be masked here (e.g., if we are called (nested)
+        * from code that does "intr_disable(); critical_enter(); ...
+        * critical_exit();"), and interrupts can even be pending then (e.g.,
+        * if we are called (nested) from Xintr* as part of ICU locking in
+        * the SMP case after Xintr* has interrupted critical_enter() after
+        * the latter decremented td->td_intr_nesting_level to 0 but before it
+        * called us).  If so, don't unpend interrupts now.
+        */
+       if ((read_eflags() & PSL_I) == 0)
+               return;
+#endif
 
        td = curthread;
+#ifdef RESCHED_HERE
+more:
+#endif
        eflags = intr_disable();
        if (PCPU_GET(int_pending)) {
+#ifndef ALLOW_PSL_I_VIOLATION
+               if (td->td_intr_nesting_level != 0)
+                       Debugger("cpu_unpend: intr_nesting_level != 0");
+#endif
                ++td->td_intr_nesting_level;
                i386_unpend();
@@ -76,4 +111,53 @@
        }
        intr_restore(eflags);
+
+#ifdef RESCHED_HERE
+       /*
+        * We kept td_critnest > 0 in i386_unpend() to prevent context
+        * switches away from us.  This results in ithread_schedule()
+        * being called in an abnormal environment so it doesn't switch
+        * context to hardware interrupt handlers in the normal way.  It
+        * sets TDF_NEEDRESCHED, but that is only intended for use in
+        * ast().  Use it here too to ensure running the highest priority
+        * thread.
+        *
+        * Preventing context switches away from us is probably not needed
+        * except possibly as an optimization.  We have to do it if we
+        * use the int_pending lock since it prevents other threads
+        * doing any unpending, but letting the other threads do the
+        * unpending should be safe if we are careful enough.  The calls
+        * to i386_unpend() from Xfastintr* need special care since they
+        * also have interrupts masked in the CPU.  Our check of
+        * TDF_NEEDRESCHED doesn't help for these calls.
+        *
+        * td_critnest > 0 also prevents swi_sched() actually switching
+        * context.  It is normal and probably even correct for swi_sched()
+        * to not switch immediately, but this shows that swi_sched()
+        * is broken as designed.  It should just schedule the scheduling
+        * of a SWI; actual scheduling should not occur until a switch is
+        * possible.  swi_sched() inherits the scheduling bug from
+        * ithread_schedule() -- in cases where a switch is not possible,
+        * it sets TDF_NEEDRESCHED but only ast() and now here checks this
+        * flag.
+        *
+        * XXX don't switch when cold -- can't get back then.  Interrupts
+        * can happen when cold, so there is another problem getting them
+        * handled as soon as we become warm.
+        */
+       if (td->td_flags & TDF_NEEDRESCHED && !cold) {
+               mtx_lock_spin(&sched_lock);
+               if (td->td_flags & TDF_IDLETD)
+                       td->td_state = TDS_CAN_RUN;     /* XXXKSE */
+               td->td_proc->p_stats->p_ru.ru_nivcsw++;
+               mi_switch();
+               /*
+                * XXX can't do mtx_unlock_spin() here due to recursion
+                * problems.
+                */
+               _release_lock_quick(&sched_lock);
+               td->td_critnest = 0;
+               goto more;
+       }
+#endif /* RESCHED_HERE */
 }
 
%%%

Don't try porting this to -current unless you understand all the above
:-).

This bug is also popular in signal handlers.  E.g., top(1)'s signal
handler was changed from theoretically broken but practically working to
theoretically working but practically broken by changing it to just set
a flag in the signal handler.  Checking flags set by signal handlers
soon enough after returning from the handlers is not easy, and it is not
done in top (see the userland sketch below).

I've fixed variants of the problem of being nested 2 deep in a critical
region, but only in my version of interrupt handling.  In my version,
ithread_schedule() is expected to be called with td_critnest == 1.
ithread_schedule() bumps td_critnest to 2 as in -current, and all
context switches are required to be done with td_critnest == 2.  This
requires minor adjustments elsewhere.  If TDF_NEEDRESCHED is set on
return from ithread_schedule(), then it is handled in the caller by
calling mi_switch() directly as above.  It can also get set while the
interrupt handler is running (but perhaps only if it doesn't get handled
right on leaving critical sections).  It can get set again while the
switched-to process is running, so there must be a loop to keep handling
it until it is clear.  The loop probably belongs in MD code because
there may be other interrupts to consider.
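As a sketch of where that loop would sit -- again my illustration under
the above assumptions, not code from either version -- the MD
interrupt-return path would do something like:

%%%
/*
 * Hypothetical sketch of the MD rescheduling loop described above.
 * Called after the handlers have run and td_critnest is back to 0.
 * A loop is needed because the switched-to thread can set
 * TDF_NEEDRESCHED again before we regain control.
 */
static void
intr_resched_sketch(struct thread *td)
{

        while (td->td_flags & TDF_NEEDRESCHED && !cold) {
                mtx_lock_spin(&sched_lock);
                mi_switch(SW_INVOL);    /* clears TDF_NEEDRESCHED for us */
                mtx_unlock_spin(&sched_lock);
                /* Recheck: another interrupt may have set it again. */
        }
}
%%%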
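For the signal-handler variant mentioned above, here is a minimal
userland illustration of the same shape of bug -- my example, not
top(1)'s actual code, and the choice of SIGWINCH is arbitrary:

%%%
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t flag;

static void
handler(int sig)
{

        flag = 1;               /* "theoretically working": just set a flag */
}

int
main(void)
{

        signal(SIGWINCH, handler);
        for (;;) {
                if (flag) {
                        flag = 0;
                        printf("handled\n");
                }
                /*
                 * "practically broken": a signal that arrives here, after
                 * the check but before we block, sets the flag but nothing
                 * looks at it until something else wakes us up -- the same
                 * indefinitely-delayed-reschedule shape as TDF_NEEDRESCHED
                 * above.  Closing the race needs pselect(2) or similar.
                 */
                pause();
        }
}
%%%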
Bruce