Re: Improvements to fsck performance in -current ...?

From: Bill Moran <wmoran_at_potentialtech.com>
Date: Fri, 03 Oct 2003 13:11:09 -0400

Jens Rehsack wrote:
> Don Lewis wrote:
> 
>> On  2 Oct, Terry Lambert wrote:
> 
> [...]
> 
>>> Actually, write caching is not so much the problem, as the disk
>>> reporting that the write has completed before the contents of
>>> the transaction saved in the write cache have actually been
>>> committed to stable storage.
>>>
>>> Unfortunately, IDE disks do not permit disconnected writes, due
>>> to a bug in the original IDE implementation, which has been
>>> carried forward for [insert no good reason here].
>>>
>>> Therefore IDE disks almost universally lie to the driver any
>>> time write caching is enabled on an IDE drive.
>>>
>>> In most cases, if you use SCSI, the problem will go away.
>>
>> Nope, they "lie" as well unless you turn off the WCE bit.  Fortunately
>> with tagged command queuing there is very little performance penalty for
>> doing this in most cases.  The main exception to this is when you run
>> newfs which talks to the raw partition and only has one command
>> outstanding at a time.
>>
>> Back in the days when our SCSI implementation would spam the console
>> whenever it reduced the number of tagged openings because the drive
>> indicated that its queue was full, I'd see the number of tagged openings
>> stay at 63 if write caching was disabled, but the number would drop
>> significantly under load (50%?) if write caching was enabled.  I always
>> suspected that the drive's cache was full of data for write commands
>> that it had indicated to the host as being complete even though the data
>> hadn't been written to stable storage.
>>
>> Unfortunately SCSI drives all seem to ship with the WCE bit set,
>> probably for "benchmarking" reasons, so I always have to remember to
>> turn this bit off whenever I install a new drive.
> 
> A message from this morning ('file system (UFS2) consistancy
> after -current crash?') to this list describes exactly the
> situation on my fileserver a few months ago, except my machine
> runs with FreeBSD 4-STABLE and has an ICP-Vortex 6528RD controller.
> 
> I think the disk's or controller's (in short, the hardware's) write
> cache is a problem. Maybe it shouldn't be in theory, but it is in the
> real world :-)

This is somewhat relevant to a discussion occurring this week on the
PostgreSQL performance mailing list.

A fellow was testing a number of caching options for disk drives, along
with the performance impact they had on PostgreSQL.  Near the end of the
discussion and his testing, he decided to do a plug test (i.e., pull the
power plug out of the wall while PostgreSQL was running a benchmark and
see if the database was recoverable on reboot).
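
To make the idea of a plug test concrete, here is a minimal sketch of
the kind of check involved.  It is not what the PostgreSQL tester ran;
the record format and the function names (append_record,
last_intact_record, lost_acknowledged_writes) are made up for
illustration.  The writer numbers each record, calls fsync, and treats
the returned sequence number as "acknowledged" (you would log it on a
second machine).  After the power is pulled and the box is rebooted,
the verifier counts how many acknowledged records are missing.

```python
import os

RECORD_SIZE = 16  # 15 digits plus newline; fixed width makes torn writes easy to spot


def append_record(fd, seq):
    """Write one numbered record and ask the OS to flush it to the drive."""
    os.write(fd, b"%015d\n" % seq)
    os.fsync(fd)  # returns once the drive *reports* the write as complete
    return seq    # hypothetical protocol: caller logs this elsewhere as acknowledged


def last_intact_record(path):
    """After reboot: highest sequence number readable, in order, from the file."""
    last = -1
    with open(path, "rb") as f:
        while True:
            rec = f.read(RECORD_SIZE)
            # Stop at the first short, torn, or out-of-order record.
            if len(rec) < RECORD_SIZE or rec != b"%015d\n" % (last + 1):
                return last
            last += 1


def lost_acknowledged_writes(path, last_acked):
    """Number of writes that were acknowledged but did not survive."""
    return max(0, last_acked - last_intact_record(path))
```

If the drive only acknowledges writes that have reached stable storage,
lost_acknowledged_writes() should come back 0 after every plug pull.
Anything above 0 means something below fsync reported completion early.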

The tests don't 100% apply, since he was testing with Linux and XFS,
but I think the results speak VOLUMES!

Every single plug test with WC enabled on the IDE drives resulted in
an unrecoverable database - every time, even with XFS' journalling,
and no matter what sync options he had enabled in Postgres.

Every single plug test with WC disabled on the IDE drives resulted
in a filesystem and database that was recoverable, even when sync
was turned totally off in Postgres.

Additionally, he noticed that turning WC on resulted in something
like a 40x performance improvement.

To me, this means:
a) if you want reliable, don't use IDE with WC enabled
b) if you want reliable and fast, don't use IDE, period, use SCSI.
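
Since SCSI only helps if the WCE bit mentioned earlier in the thread is
actually off, here is a rough sketch of checking it.  It assumes you
already have the raw bytes of the drive's caching mode page (page code
0x08) from some tool, for example camcontrol(8)'s modepage subcommand;
fetching the page is outside the sketch, and the function names are
mine.  In that page, WCE is bit 2 of byte 2.

```python
WCE_MASK = 0x04        # bit 2 of byte 2 in the SCSI caching mode page
CACHING_PAGE = 0x08    # page code, stored in the low 6 bits of byte 0


def _check_page(page: bytes) -> None:
    """Reject input that is not a caching mode page."""
    if len(page) < 3 or (page[0] & 0x3F) != CACHING_PAGE:
        raise ValueError("not a caching mode page")


def write_cache_enabled(page: bytes) -> bool:
    """Return True if the WCE bit is set in a raw caching mode page."""
    _check_page(page)
    return bool(page[2] & WCE_MASK)


def with_write_cache_disabled(page: bytes) -> bytes:
    """Return a copy of the page with only the WCE bit cleared."""
    _check_page(page)
    out = bytearray(page)
    out[2] &= ~WCE_MASK & 0xFF
    return bytes(out)
```

This only decodes and edits the bytes; writing the page back to the
drive still has to go through whatever tool produced it.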

-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com
Received on Fri Oct 03 2003 - 08:11:12 UTC
