Re: CAM Target Layer available

From: Kenneth D. Merry <ken_at_FreeBSD.org>
Date: Wed, 11 Jan 2012 22:04:58 -0700

On Wed, Jan 04, 2012 at 21:53:11 -0700, Kenneth D. Merry wrote:
> 
> The CAM Target Layer (CTL) is now available for testing.  I am planning to
> commit it to head next week, barring any major objections.
> 
> CTL is a disk and processor device emulation subsystem originally written
> for Copan Systems under Linux starting in 2003.  It has been shipping in
> Copan (now SGI) products since 2005.
> 
> It was ported to FreeBSD in 2008, and thanks to an agreement between SGI
> (who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is
> available under a BSD-style license.  The intent behind the agreement was
> that Spectra would work to get CTL into the FreeBSD tree.
> 
> The patches are against FreeBSD/head as of SVN change 229516 and are
> located here:
> 
> http://people.freebsd.org/~ken/ctl/ctl_diffs.20120104.4.txt.gz
> 
> The code is not "perfect" (few pieces of software are), but is in good
> shape from a functional standpoint.  My intent is to get it out there for
> other folks to use, and perhaps help with improvements.
> 
> There are a few other CAM changes included with these diffs, some of which
> will be committed separately from CTL, some concurrently.  This is a quick
> summary:
> 
>  - Fix a panic in the da(4) driver when a drive disappears on boot.
>  - Fix locking in the CAM EDT traversal code.
>  - Add an optional sysctl/tunable (disabled by default) to suppress
>    "duplicate" devices.  This most frequently shows up with dual ported SAS
>    drives.
>  - Add some very basic error injection into the da(4) driver.
>  - Bump the length field in the SCSI INQUIRY CDB to 2 bytes to line up with
>    more recent SCSI specs.
> 
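To illustrate the last item in the list above: a minimal sketch, in Python rather than the CAM C code, of a 6-byte INQUIRY CDB whose allocation length occupies bytes 3 and 4 as a 16-bit big-endian value, as in more recent SPC revisions (older revisions used only byte 4). The function name build_inquiry_cdb is hypothetical and is not part of CAM or CTL.

```python
import struct

INQUIRY_OPCODE = 0x12  # SCSI INQUIRY operation code


def build_inquiry_cdb(alloc_len, evpd=False, page_code=0, control=0):
    """Pack a 6-byte INQUIRY CDB with a 2-byte allocation length.

    Hypothetical helper for illustration only; not the CAM implementation.
    """
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in 16 bits")
    # Layout: opcode, EVPD flag byte, page code,
    # allocation length (2 bytes, big-endian), control byte.
    return struct.pack(">BBBHB", INQUIRY_OPCODE, 1 if evpd else 0,
                       page_code, alloc_len, control)


cdb = build_inquiry_cdb(alloc_len=512)
print(cdb.hex())  # 120000020000
```

A length of 512 does not fit in the old single-byte field, which is why code that fills in that field byte by byte could be affected by the change.
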
> CTL Features:
> ============
> 
>  - Disk and processor device emulation.
>  - Tagged queueing
>  - SCSI task attribute support (ordered, head of queue, simple tags)
>  - SCSI implicit command ordering support.  (e.g. if a read follows a mode
>    select, the read will be blocked until the mode select completes.)
>  - Full task management support (abort, LUN reset, target reset, etc.)
>  - Support for multiple ports
>  - Support for multiple simultaneous initiators
>  - Support for multiple simultaneous backing stores
>  - Persistent reservation support
>  - Mode sense/select support
>  - Error injection support
>  - High Availability support (1)
>  - All I/O handled in-kernel, no userland context switch overhead.
> 
> (1) HA Support is just an API stub, and needs much more to be fully
>     functional.  See the to-do list below.
> 
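The implicit command ordering item above can be pictured with a toy model. This is only a sketch of the one example given (a read queued behind a mode select), written in Python; the rule, the tuple layout, and the name dispatchable are assumptions for illustration and do not describe CTL's actual ordering logic.

```python
def dispatchable(queue):
    """Return the names of unfinished commands that may run now.

    queue is a list of (name, done) tuples in arrival order.
    Toy rule (an assumption): nothing queued behind an unfinished
    MODE SELECT may start until that MODE SELECT completes.
    """
    ready = []
    mode_select_pending = False
    for name, done in queue:
        if done:
            continue
        if not mode_select_pending:
            ready.append(name)
        if name == "MODE SELECT":
            mode_select_pending = True
    return ready


# The read is held back while the mode select is outstanding...
print(dispatchable([("MODE SELECT", False), ("READ", False)]))
# ...and released once the mode select has completed.
print(dispatchable([("MODE SELECT", True), ("READ", False)]))
```
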
> Configuring and Running CTL:
> ===========================
> 
>  - After applying the CTL patchset to your tree, build world and install it
>    on your target system.
> 
>  - Add 'device ctl' to your kernel configuration file.
> 
>  - If you're running with an 8Gb or 4Gb Qlogic FC board, add
>    'options ISP_TARGET_MODE' to your kernel config file.  'device ispfw'
>    or loading the ispfw module is also recommended.
> 
>  - Rebuild and install a new kernel.
> 
>  - Reboot with the new kernel.
> 
>  - To add a LUN with the RAM disk backend:
> 
> 	ctladm create -b ramdisk -s 10485760000000000000
> 	ctladm port -o on
> 
>  - You should now see the CTL disk LUN through camcontrol devlist:
> 
> scbus6 on ctl2cam0 bus 0:
> <FREEBSD CTLDISK 0001>             at scbus6 target 1 lun 0 (da24,pass32)
> <>                                 at scbus6 target -1 lun -1 ()
> 
>    This is visible through the CTL CAM SIM.  This allows using CTL without
>    any physical hardware.  You should be able to issue any normal SCSI
>    commands to the device via the pass(4)/da(4) devices.
> 
>    If any target-capable HBAs are in the system (e.g. isp(4)), and have
>    target mode enabled, you should now also be able to see the CTL LUNs via
>    that target interface.
> 
>    Note that all CTL LUNs are presented to all frontends.  There is no
>    LUN masking, or separate, per-port configuration.
> 
>  - Note that the ramdisk backend is a "fake" ramdisk.  That is, it is
>    backed by a small amount of RAM that is used for all I/O requests.  This
>    is useful for performance testing, but not for any data integrity tests.
> 
>  - To add a LUN with the block/file backend:
> 
> 	truncate -s +1T myfile
> 	ctladm create -b block -o file=myfile
> 	ctladm port -o on
> 
>  - You can also see a list of LUNs and their backends like this:
> 
> # ctladm devlist
> LUN Backend       Size (Blocks)   BS Serial Number    Device ID       
>   0 block            2147483648  512 MYSERIAL   0     MYDEVID   0     
>   1 block            2147483648  512 MYSERIAL   1     MYDEVID   1     
>   2 block            2147483648  512 MYSERIAL   2     MYDEVID   2     
>   3 block            2147483648  512 MYSERIAL   3     MYDEVID   3     
>   4 block            2147483648  512 MYSERIAL   4     MYDEVID   4     
>   5 block            2147483648  512 MYSERIAL   5     MYDEVID   5     
>   6 block            2147483648  512 MYSERIAL   6     MYDEVID   6     
>   7 block            2147483648  512 MYSERIAL   7     MYDEVID   7     
>   8 block            2147483648  512 MYSERIAL   8     MYDEVID   8     
>   9 block            2147483648  512 MYSERIAL   9     MYDEVID   9     
>  10 block            2147483648  512 MYSERIAL  10     MYDEVID  10     
>  11 block            2147483648  512 MYSERIAL  11     MYDEVID  11    
> 
>  - You can see the LUN type and backing store for block/file backend LUNs
>    like this:
> 
> # ctladm devlist -v
> LUN Backend       Size (Blocks)   BS Serial Number    Device ID       
>   0 block            2147483648  512 MYSERIAL   0     MYDEVID   0     
>       lun_type=0
>       num_threads=14
>       file=testdisk0
>   1 block            2147483648  512 MYSERIAL   1     MYDEVID   1     
>       lun_type=0
>       num_threads=14
>       file=testdisk1
>   2 block            2147483648  512 MYSERIAL   2     MYDEVID   2     
>       lun_type=0
>       num_threads=14
>       file=testdisk2
>   3 block            2147483648  512 MYSERIAL   3     MYDEVID   3     
>       lun_type=0
>       num_threads=14
>       file=testdisk3
>   4 block            2147483648  512 MYSERIAL   4     MYDEVID   4     
>       lun_type=0
>       num_threads=14
>       file=testdisk4
>   5 block            2147483648  512 MYSERIAL   5     MYDEVID   5     
>       lun_type=0
>       num_threads=14
>       file=testdisk5
>   6 block            2147483648  512 MYSERIAL   6     MYDEVID   6     
>       lun_type=0
>       num_threads=14
>       file=testdisk6
>   7 block            2147483648  512 MYSERIAL   7     MYDEVID   7     
>       lun_type=0
>       num_threads=14
>       file=testdisk7
>   8 block            2147483648  512 MYSERIAL   8     MYDEVID   8     
>       lun_type=0
>       num_threads=14
>       file=testdisk8
>   9 block            2147483648  512 MYSERIAL   9     MYDEVID   9     
>       lun_type=0
>       num_threads=14
>       file=testdisk9
>  10 ramdisk                   0    0 MYSERIAL   0     MYDEVID   0     
>       lun_type=3
>  11 ramdisk     204800000000000  512 MYSERIAL   1     MYDEVID   1     
>       lun_type=0
> 
>  - To see system throughput, use ctlstat(8):
> 
> # ctlstat -t
>           System Read          System Write          System Total
>    ms  KB/t tps  MB/s    ms  KB/t tps  MB/s    ms  KB/t tps  MB/s 
>  1.71 50.64   0  0.00  1.24 512.00   0  0.03  2.05 245.20   0  0.03    1.0%
>  0.00  0.00   0  0.00  1.12 512.00 564 282.00  1.12 512.00 564 282.00    8.4%
>  0.00  0.00   0  0.00  1.27 512.00 536 268.00  1.27 512.00 536 268.00   10.0%
>  0.00  0.00   0  0.00  1.27 512.00 535 267.50  1.27 512.00 535 267.50    7.6%
>  0.00  0.00   0  0.00  1.12 512.00 520 260.00  1.12 512.00 520 260.00   10.9%
>  0.00  0.00   0  0.00  1.02 512.00 538 269.00  1.02 512.00 538 269.00   10.9%
>  0.00  0.00   0  0.00  1.10 512.00 557 278.50  1.10 512.00 557 278.50    9.6%
>  0.00  0.00   0  0.00  1.12 512.00 561 280.50  1.12 512.00 561 280.50   10.4%
>  0.00  0.00   0  0.00  1.14 512.00 502 251.00  1.14 512.00 502 251.00    6.5%
>  0.00  0.00   0  0.00  1.31 512.00 527 263.50  1.31 512.00 527 263.50   10.5%
>  0.00  0.00   0  0.00  1.07 512.00 560 280.00  1.07 512.00 560 280.00   10.3%
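
The numbers in the listings above can be sanity-checked with a little arithmetic. This is a sketch in Python; the helper names are made up, and the MB/s formula is an assumption inferred from the sample output rather than taken from the ctlstat source.

```python
def lun_size_bytes(blocks, block_size):
    """Capacity of a LUN given its block count and block size (BS)."""
    return blocks * block_size


def throughput_mb_s(tps, kb_per_transfer):
    """Assumed relation: MB/s = transfers per second * KB per transfer / 1024."""
    return tps * kb_per_transfer / 1024


# 2147483648 blocks of 512 bytes is 2**40 bytes, which matches a
# backing file of size 1T as in the truncate example.
print(lun_size_bytes(2147483648, 512) == 2**40)  # True

# 564 transfers/s at 512.00 KB each matches the 282.00 MB/s row.
print(throughput_mb_s(564, 512.00))  # 282.0
```
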
> 
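As for why the "fake" ramdisk backend described above is fine for performance testing but not for data integrity tests, here is a toy model in Python. The class name FakeRamdisk and the modulo mapping are assumptions for illustration; the real backend only promises that a small amount of RAM is reused for all I/O requests.

```python
class FakeRamdisk:
    """Toy model: every LBA maps onto one small shared buffer."""

    def __init__(self, block_size=512, buffer_blocks=4):
        self.block_size = block_size
        self.buffer_blocks = buffer_blocks
        self.buf = bytearray(block_size * buffer_blocks)

    def _offset(self, lba):
        # Assumed mapping: all LBAs wrap into the same few buffer slots.
        return (lba % self.buffer_blocks) * self.block_size

    def write(self, lba, data):
        if len(data) != self.block_size:
            raise ValueError("write must be exactly one block")
        off = self._offset(lba)
        self.buf[off:off + self.block_size] = data

    def read(self, lba):
        off = self._offset(lba)
        return bytes(self.buf[off:off + self.block_size])


disk = FakeRamdisk()
disk.write(0, b"A" * 512)
disk.write(4, b"B" * 512)          # lands on the same slot as LBA 0
print(disk.read(0) == b"A" * 512)  # False: the earlier data is gone
```

I/O completes at memory speed regardless of the advertised LUN size, but a read is not guaranteed to return what was last written to that block.
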
> CTL To Do List:
> ==============
> 
>  - Use devstat(9) for CTL's statistics collection.  CTL uses a home-grown
>    statistics collection system that is similar to devstat(9).  ctlstat
>    should be retired in favor of iostat, etc., once aggregation modes are
>    available in iostat to match the behavior of ctlstat -t and dump modes
>    are available to match the behavior of ctlstat -d/ctlstat -J.
> 
>  - ZFS ARC backend for CTL.  Since ZFS copies all I/O into the ARC
>    (Adaptive Replacement Cache), running the block/file backend on top of a
>    ZFS-backed zdev or file will involve an extra set of copies.  The
>    optimal solution for backing targets served by CTL with ZFS would be to
>    allocate buffers out of the ARC directly, and DMA to/from them directly.
>    That would eliminate an extra data buffer allocation and copy.
> 
>  - Switch CTL over to using CAM CCBs instead of its own union ctl_io.  This
>    will likely require a significant amount of work, but will eliminate
>    another data structure in the stack, more memory allocations, etc.  This
>    will also require changes to the CAM CCB structure to support CTL.
> 
>  - Full-featured High Availability support.  The HA API that is in ctl_ha.h
>    is essentially a renamed version of Copan's HA API.  There is no
>    substance to it, but it remains in CTL to show what needs to be done to
>    implement active/active HA from a CTL standpoint.  The things that would
>    need to be done include:
> 	- A kernel level software API for message passing as well as DMA
> 	  between at least two nodes.
> 	- Hardware support and drivers for inter-node communication.  This
> 	  could be as simple as Ethernet hardware and drivers.
> 	- A "supervisor", or startup framework to control and coordinate
> 	  HA startup, failover (going from active/active to single mode),
> 	  and failback (going from single mode to active/active).
> 	- HA support in other components of the stack.  The goal behind HA
> 	  is that one node can fail and another node can seamlessly take
> 	  over handling I/O requests.  This requires support from pretty
> 	  much every component in the storage stack, from top to bottom.
> 	  CTL is one piece of it, but you also need support in the RAID
> 	  stack/filesystem/backing store.  You also need full configuration
> 	  mirroring, and all peer nodes need to be able to talk to the
> 	  underlying storage hardware.

I checked CTL into head today, along with most of the CAM changes I
mentioned above.  My plan is to MFC CTL into stable/9 in a month.  If there
is enough interest, I can probably MFC CTL into stable/8 as well.

The only potential hiccup there is the change in the size of the inquiry
CDB length field.  I doubt many, if any, ports are using that data
structure, but it is a small API change.  (Albeit one brought on by a
standards change.)  In any case, if anyone sees any ports breakage as a
result, please let me know.

I'm planning on MFCing the other CAM changes in 2 weeks.

I decided not to put in the duplicate suppression code for now.  It's a
little kludgy.  If people think it would be valuable, I can put it in.
It's really just a stopgap until we get actual multipath and SAS probing
support in CAM.

Ken
-- 
Kenneth Merry
ken_at_FreeBSD.ORG
Received on Thu Jan 12 2012 - 04:04:59 UTC