MySQL thread deadlock, even after r195403 libthr fix

From: Nick Esborn <nick_at_desert.net>
Date: Mon, 13 Jul 2009 13:03:20 -0700

Hello,

I am working with a 16-core Opteron server which runs five separate
MySQL 5.0 processes in jails, each with its own data set on MyISAM
tables.  It ran flawlessly for about six months under FreeBSD 7.0.

After upgrading from 7.0 to 7.2, one of the MySQL processes has begun
exhibiting a serious problem.  Anywhere from several hours to a day
or two after the process starts, a query thread locks up.
Output from procstat -k for such a thread:

3896 100795 mysqld - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep do_rw_rdlock __umtx_op_rw_rdlock syscall Xfast_syscall
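
For anyone who wants to watch for the same symptom, here is a rough
Python sketch that scans procstat -k output for threads whose kernel
stack contains the rdlock frames shown in the trace.  It is only an
illustration, not something I ran on the affected machine.  The
function names are made up for this example, and it assumes each
thread occupies one unwrapped row (as procstat prints it) and that the
COMM and TDNAME columns contain no whitespace.

```python
import subprocess

# Kernel frames that indicate a thread sleeping on a umtx read lock
# (taken from the trace in this message).
RDLOCK_FRAMES = {"do_rw_rdlock", "__umtx_op_rw_rdlock"}


def parse_kstack_line(line):
    """Split one procstat -k row into (pid, tid, comm, tdname, frames).

    Assumption: COMM and TDNAME contain no whitespace.  Returns None for
    the header row or any row that does not start with two integers.
    """
    fields = line.split()
    if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
        return None
    return int(fields[0]), int(fields[1]), fields[2], fields[3], fields[4:]


def threads_in_rdlock(procstat_output):
    """Return (pid, tid) for every thread whose stack has an rdlock frame."""
    hits = []
    for line in procstat_output.splitlines():
        row = parse_kstack_line(line)
        if row and RDLOCK_FRAMES.intersection(row[4]):
            hits.append((row[0], row[1]))
    return hits


def sample_process(pid):
    """Run `procstat -k <pid>` (FreeBSD only) and report threads in rdlock."""
    out = subprocess.run(["procstat", "-k", str(pid)],
                         capture_output=True, text=True, check=True).stdout
    return threads_in_rdlock(out)
```

A single hit proves nothing, since healthy threads pass through these
frames briefly.  The same TID showing up across samples taken minutes
apart is what would match the lockup described here.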

Once this thread locks up, replication grinds to a halt, as the thread  
holds read locks.  To unwedge the situation, I have to kill -9 the  
mysqld process, myisamchk the tables, and start the process back up  
again.

7.2-RELEASE-p2 did not resolve the problem.   Late last week I  
upgraded to 8.0-BETA1 with the r195403 libthr fix, but that also  
failed to resolve the problem.  Mind you, in every other way, 8.0 is  
amazing on this 16-core server.

I had initially filed a bug report here:

   http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/135673

This was before the 16-core upgrade, and before trying 8.0.

Subsequently, though, a conversation with Jeff Roberson helped me  
realize that it's more likely a kernel issue than a MySQL one.

I hope this deadlock can be resolved.  We really need 8.0's  
performance on this class of server.

Thanks,

-nick

--
nick_at_desert.net - all messages cryptographically signed