Re: New SCHED_SMP diff.

From: Suleiman Souhlal <ssouhlal_at_FreeBSD.org>
Date: Mon, 2 Jul 2007 18:08:32 -0700

On Jul 2, 2007, at 3:18 PM, Attilio Rao wrote:

> 2007/7/2, Jeff Roberson <jroberson_at_chesapeake.net>:
>> I forgot:
>>
>> http://people.freebsd.org/~jeff/schedsmp.diff
>>
>> --- amd64/amd64/cpu_switch.S	6 Jun 2007 07:35:07 -0000	1.158
>> +++ amd64/amd64/cpu_switch.S	2 Jul 2007 05:43:31 -0000
>> @@ -148,13 +148,7 @@
>>  	movq	%cr3,%rax
>>  	cmpq	%rcx,%rax			/* Same address space? */
>>  	jne	swinact
>> -	movq	%rdx, TD_LOCK(%rdi)		/* Release the old thread */
>> -	/* Wait for the new thread to become unblocked */
>> -	movq	$blocked_lock, %rdx
>> -1:
>> -	movq	TD_LOCK(%rsi),%rcx
>> -	cmpq	%rcx, %rdx
>> -	je	1b
>> +	xchgq	%rdx, TD_LOCK(%rdi)		/* Release the old thread */
>
> I don't think you need an atomic instruction here; a memory barrier
> through sfence is good enough to make thread migration
> consistent.

SFENCE is not needed. On x86, stores are already strongly ordered with
respect to other stores (unless you use write-combining memory or
non-temporal stores).
The main advantage of using an atomic operation when unlocking is that
it should make the store visible to other CPUs sooner, so they don't
spin as long, although I think you'll have a hard time noticing a
difference in a macrobenchmark.
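
To make the two options concrete, here is a rough C11 sketch of the
handoff. It is not the kernel code: struct thread, struct lock,
run_lock and the function names are hypothetical stand-ins, and only
td_lock and blocked_lock mirror names from the diff. On x86 the release
store would typically compile to a plain mov and the exchange to xchg,
which is implicitly locked when it has a memory operand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumptions). */
struct lock { int unused; };
struct thread { _Atomic(struct lock *) td_lock; };

static struct lock blocked_lock; /* sentinel: thread is still switching out */
static struct lock run_lock;     /* hypothetical lock the thread is handed to */

/*
 * Variant 1: plain release store. No SFENCE is involved; on x86 ordinary
 * stores are not reordered with other ordinary stores.
 */
static void
release_with_store(struct thread *td, struct lock *new_lock)
{
	atomic_store_explicit(&td->td_lock, new_lock, memory_order_release);
}

/*
 * Variant 2: atomic exchange, the analogue of the xchgq in the diff.
 * Returns the previous lock pointer.
 */
static struct lock *
release_with_xchg(struct thread *td, struct lock *new_lock)
{
	return (atomic_exchange_explicit(&td->td_lock, new_lock,
	    memory_order_release));
}

/* Waiter side: spin until the thread is no longer on blocked_lock. */
static struct lock *
wait_unblocked(struct thread *td)
{
	struct lock *l;

	while ((l = atomic_load_explicit(&td->td_lock,
	    memory_order_acquire)) == &blocked_lock)
		;	/* spin */
	assert(l != &blocked_lock);
	return (l);
}
```

Both variants give the waiter the same ordering guarantee. The only
difference is how soon the new td_lock value is likely to become
visible to a CPU spinning in wait_unblocked().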

-- Suleiman