Olivier Nicole wrote:
>> > On a related note, why is slapd so damn fragile? It's a righteous pain
>> > in the bum the way you have to run db_recover-X.Y /var/db/openldap-data
>> > if slapd fails to start.
>> Yes, this is a lot of pain. I have had the same issues and never
>> figured out what the reason was. /var/ is very often corrupted after a
>> crash, power failure or unclean reboot. Maybe it is not slapd that is
>> fragile, but db47.
>
> Last June, we had to shut down our openldap server every night. I
> noticed that a simple halt(8) would leave the bdb backend database in
> a corrupted state.
>
> It worked well if I ran /usr/local/etc/rc.d/slapd stop and sync(8) a couple
> of times before I halt(8).
>
> After that I wrote a small script that would take a backup of the ldap
> data every 2 hours and keep 5 days of backups.
>
> It seems that Berkeley DB has a lot of options that need to be
> configured to work optimally with openldap. Maybe soft-updates
> should be deactivated on the filesystem where the db files reside.

This has nothing to do with the filesystem, but with abuse of the
application (read: the LDAP daemon) and/or its Berkeley DB support.

If you kill the application before it can write all that it needs to
write, you may corrupt your database, particularly if you catch it in
the middle of a page split (which happens when a page in the DB file
overflows) and your database isn't transactional (i.e. with log.* files,
which in turn requires application support).

I'm not sure about OpenLDAP, but I feel I know Berkeley DB well enough to
know it does not usually create or rename files on shutdown EXCEPT if
there are bulk writes pending in a transactional database (which might
then trigger creation of log.* files or flushing of the corresponding
writes). So I'd be surprised if SOFTDEPs were a cause of db47 corruption
here. SOFTDEPs may have side effects that influence the shutdown process
as a whole, but then the shutdown process is broken already without
softdeps.
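To illustrate the point about not killing the daemon before it has written everything, here is a minimal sketch of "ask the process to stop, then wait until it is really gone". It is not taken from the slapd rc script or from OpenLDAP; the function name stop_and_wait and the timeout and poll_interval parameters are made up for the example, and the 30-second default is an arbitrary assumption.

```python
import os
import signal
import time


def stop_and_wait(pid, timeout=30.0, poll_interval=0.5):
    """Send SIGTERM to pid, then poll until the process has disappeared.

    Returns True if the process was gone within `timeout` seconds,
    False if it was still there when the timeout expired.
    (Hypothetical helper for illustration only.)
    """
    try:
        os.kill(pid, signal.SIGTERM)  # polite request to shut down
    except ProcessLookupError:
        return True  # nothing to stop, the process is already gone

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the pid exists
        except ProcessLookupError:
            return True  # process has exited, files should be closed
        time.sleep(poll_interval)
    return False  # still running: do NOT proceed to halt(8) yet
```

A caller would read the pid from the daemon's pid file and only continue with the shutdown once this returns True. Note that the existence check also succeeds for a zombie child that has not been reaped yet, so the sketch fits daemons that are not children of the calling process.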
So I'd rather make sure that the daemons are shut down properly at
shutdown time, i.e. run the stop scripts and make sure they sleep long
enough for the application to shut down cleanly (as needed). The database
should be properly closed before halt(8) draws the SIGKILL shotgun and
starts firing.

IOW, check that your slapd stop is properly hooked into the shutdown
procedure and waits long enough.

If your filesystems get corrupted at power failures, make sure your HDD
write caches are turned off (unless they're battery-backed or otherwise
persistent caches that survive the outage); you'd also need to check
whether your hardware is allowed to reorder writes, and if so, whether
writes get reordered across flush-cache primitives (a.k.a. write
barriers). I'm unaware of current support for preventing dangerous
reorders with enabled write caches in the disk/controller drivers and
filesystems, though.

Received on Wed Sep 23 2009 - 10:52:03 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:55 UTC