Re: RELENG_7: SATA hotplug does not work

From: Jeremy Chadwick <koitsu_at_FreeBSD.org>
Date: Thu, 25 Oct 2007 01:21:53 -0700
On Thu, Oct 25, 2007 at 11:21:36AM +0400, Dmitry Morozovsky wrote:
> Dear colleagues,
> 
> with -CURRENT and RELENG_7 I have never been able to hot-plug a SATA disk:
> it always enters an endless loop like
> 
> ad10: 381554MB <Seagate ST3400620AS 3.AAK> at ata5-master SATA150
> ad10: detached
> 
> on every controller I have (though they are mostly on the cheap side:
> onboard NVIDIA, JMicron, Intel ICH7/ICH8, etc; Promise FastTrak SATA --
> they all behave the same)
> 
> RELENG_6 on those controllers usually survives hotplug.

We can hot-swap on our Supermicro 5015M-T+ servers successfully, which
I've confirmed using RELENG_6.

One of our new 5015M-T+ boxes is running RELENG_7, so this weekend I'll
send one of our datacenter guys out to the co-lo to attempt the
procedure below.  I'll report the results once that's done.

The procedure we use is:

* If the disk is still mounted and usable (e.g. it is showing signs
  of bad blocks), migrate data to the main (boot) disk, stop any
  daemons using data on the disk (or restart them with appropriate
  config changes), then umount the partition(s) on the bad disk

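* Send datacenter folks out with replacement disk.  When they arrive,
  before any work is done, detach the ATA channel the disk sits on
  (ataN, the "at ataN-master" part of the disk's dmesg line; atacontrol
  takes the channel name, not the adXX disk name):
    atacontrol detach ataN

* dmesg on the machine should report:
    subdiskXX: detached
    adXX: detached

* Have datacenter folks swap disk, then do:
    atacontrol attach ataN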

* dmesg on the machine should show something like:
    adXX: 238475MB <WDC WD2500KS-00MJB0 02.01C03> at ata3-master SATA300 

* Use fdisk, newfs, or sysinstall to create partitions/filesystems,
  mount it, and start restoring...
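
The detach/attach steps above can be sketched as a few sh functions.
This is only a rough sketch under assumptions: it assumes atacontrol's
detach and attach subcommands take the ATA channel name (ataN), and the
names RUN, channel_of, detach_channel and attach_channel are made up
for illustration.  By default it only prints the commands instead of
running them.

```shell
#!/bin/sh
# Dry-run sketch of the swap steps; commands are printed, not executed.
# RUN, channel_of, detach_channel and attach_channel are hypothetical names.

RUN="${RUN:-echo}"    # RUN="" would execute the commands for real (needs root)

# Read dmesg probe lines on stdin and print the ATA channel they name, e.g.
#   ad10: 381554MB <...> at ata5-master SATA150   ->   ata5
# Lines without an "at ataN-..." part (such as "ad10: detached") print nothing.
channel_of() {
    sed -n 's/.* at \(ata[0-9][0-9]*\)-.*/\1/p'
}

# Detach the channel before the old disk is pulled.
detach_channel() {
    $RUN atacontrol detach "$1"
}

# Re-attach the channel once the replacement disk is seated.
attach_channel() {
    $RUN atacontrol attach "$1"
}
```

With the default RUN=echo, "detach_channel ata3" just prints
"atacontrol detach ata3", so the sequence can be checked before anyone
touches the hardware.  Check dmesg by hand between the two calls, as in
the list above.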

I've described my success here:

http://www.nabble.com/FreeBSD---Hot-pluggable-disks-(SATA-)-t4149798.html

However, on my home machine (nForce 4-based), if I attempt a hot-swap
by removing the SATA cable and then the power cable from the drive,
either the kernel panics and the machine reboots, or the machine
simply power-cycles on its own.

I've been told that I should be removing the power cable *first*, but I
don't see how the order would matter.

Keep in mind that the servers I mention above have a proper SATA
hot-swap backplane; supposedly this is needed for hot-swapping,
otherwise "odd things" can happen.  I presume that the backplane keeps
the signalling to the controller constant (regardless of a disk being
removed), while the manual method on my nForce 4 machine actually
disconnects the controller -- literally -- from the drive.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |
Received on Thu Oct 25 2007 - 06:21:57 UTC
