On Sun, Jan 13, 2013 at 09:36:11PM +0200, Alexander Motin wrote:
> On 13.01.2013 20:09, Marius Strobl wrote:
> > On Tue, Jan 08, 2013 at 12:46:57PM +0200, Alexander Motin wrote:
> >> On 06.01.2013 17:23, Marius Strobl wrote:
> >>> I'm not really sure what to do about that. Earlier you already said
> >>> that sched_bind(9) also isn't an option in case if td_critnest > 1.
> >>> To be honest, I don't really understand why using a spin lock in the
> >>> timecounter path makes sparc64 the only problematic architecture
> >>> for your changes. The x86 i8254_get_timecount() also uses a spin lock
> >>> so it should be in the same boat.
> >>
> >> The problem is not in using spinlock, but in waiting for other CPU while
> >> spinlock is held. Other CPU may also hold spinlock and wait for
> >> something, causing deadlock. i8254 code uses spinlock just to atomically
> >> access hardware registers, so it causes no problems.
> >
> > Okay, but wouldn't that be a general problem then? Pretty much
> > anything triggering an IPI holds smp_ipi_mtx while doing so and
> > the lower level IPI stuff waits for other CPU(s), including on
> > x86.
>
> The problem is general. But now it works because single smp_ipi_mtx is
> used in all cases where IPI result is waited. As soon as spinning
> happens with interrupts still enabled, there is no deadlocks. But
> problem reappears if any different lock is used, or locks are nested.

I'm having a hard time getting an alternate time counter device to
work. The crystal required for the counters in the south bridge just
doesn't seem to be mounted anywhere near it (I've not looked at the
bottom of the PCB though). While the time counter part of the on-board
bge(4) driven chips basically works, the chips don't seem to like the
concurrent accesses caused by the rest of bge(4). I.e. although the
counter is just read, sooner or later this causes a fatal bus error.
I haven't tried serializing accesses to the chip, but getting to such
a complexity for just reading a non-indexed register at least doesn't
feel good ...

However, AFAICT the scenario you describe can't happen. On sparc64,
spinlock_enter() only raises the processor interrupt level, which
doesn't block the direct cross traps that I've used to implement
remote reading of (S)TICK (which also means that the actions such
traps may perform are very limited and must occur in interrupt
context, but they are sufficient for this purpose, and the
restrictions in turn make them very fast). I.e. even if the AP holds
smp_ipi_mtx or any number of nested spin locks, this will not deadlock
in case the BSP also holds any spin lock when reading (S)TICK from it.

Marius

Received on Mon Jan 21 2013 - 08:54:59 UTC
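
The deadlock argument in this thread can be sketched as a small model. This is only an illustration, not FreeBSD kernel code: `cpu_model`, `req_kind`, `request_serviced()` and `deadlocked()` are hypothetical names made up here. It also assumes, as a simplification, that a CPU inside a spin lock section never services an ordinary IPI, while a direct cross trap of the kind described above is still delivered.

```c
#include <stdbool.h>

/* Hypothetical model for illustration only; not FreeBSD kernel code. */
enum req_kind {
    REQ_MASKABLE_IPI,   /* assumed blocked while the target holds a spin lock */
    REQ_CROSS_TRAP      /* assumed delivered even with the interrupt level raised */
};

struct cpu_model {
    bool in_spinlock;   /* CPU is inside a spin lock section */
    int waits_on;       /* index of the CPU whose reply is awaited, -1 if none */
    enum req_kind kind; /* how the request to waits_on is delivered */
};

/* Can the target answer a request of the given kind in its current state? */
static bool
request_serviced(const struct cpu_model *target, enum req_kind kind)
{
    if (kind == REQ_CROSS_TRAP)
        return (true);
    return (!target->in_spinlock);
}

/*
 * Two CPUs are deadlocked in this model if each waits on the other and
 * neither request can ever be serviced.
 */
static bool
deadlocked(const struct cpu_model cpus[], int a, int b)
{
    return (cpus[a].waits_on == b && cpus[b].waits_on == a &&
        !request_serviced(&cpus[b], cpus[a].kind) &&
        !request_serviced(&cpus[a], cpus[b].kind));
}
```

In this model, two CPUs that both hold spin locks and wait on each other via `REQ_MASKABLE_IPI` are reported as deadlocked. Switching one of the requests to `REQ_CROSS_TRAP` breaks the cycle, which is the property the sparc64 (S)TICK read relies on.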