indefinite wait buffer

From: Arno J. Klaassen <arno_at_heho.snv.jussieu.fr>
Date: 27 May 2006 02:25:26 +0200
Hello,

We use FreeBSD, among other systems, for scientific calculations, and
ran into the 'indefinite wait buffer' problem on ordinary
swap/dump devices. The swap overhead is justified because it lets us
treat larger data sets, while the processes remain, grosso modo,
CPU-bound rather than I/O(swap)-bound.

On recent RELENG_6, however, this fails: for sure on scrappy
ATA devices, and rather easily on SCSI devices as well, though those
seem to survive 'a couple of' 'indefinite wait buffer'
warnings.

I tested today on an amd64 notebook with 1G physmem and 4G swap
on an off-the-shelf ATA disk.
I wrote the following code:

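  #include <stdio.h>
  #include <stdlib.h>   /* on FreeBSD this also declares _malloc_options */

  /* bytes per Meg; the original definition was not part of the post */
  #define M_SIZE (1024UL * 1024UL)

  int
  main (int  argc, char **argv)
  {
    unsigned long maxpage;
    int * base;
    int iter = 0;

    if (argc < 2) { fprintf (stderr, "usage: %s <Meg>\n", argv[0]); exit (1); }

    /* FreeBSD malloc(3) options: A = abort on problems, J = junk-fill */
    _malloc_options = "AJ";

    maxpage = strtoul(argv[1],(char **)NULL, 10) * M_SIZE;
    fprintf (stderr, "Allocing %lu Bytes\n", maxpage);
    base = (int *)(malloc (maxpage));

    if (base == NULL ) { fprintf (stderr, "Jammer\n"); exit (1); }
    while (0 == 0) {
      int * ptr = base;
      unsigned long i = 0;

      /* touch every word so that each page is dirtied on every pass */
      for (i=0; i< maxpage/sizeof(int); i++) {
        *(ptr++) += 1;
      }
      fprintf (stderr, "Loop <%d> done\n", iter);
      iter++;
    }
    exit (0);   /* not reached */
  }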

Calling this (on RELENG_6) with an argument between
1024 and 1500 deadlocks the notebook within a few minutes;
after reboot, /var/log/messages shows 'indefinite wait buffer'.

After some fiddling I came up with the attached amateurish patch:
swap_pager.c has a (heuristically chosen, I suppose) timeout of 20 seconds
for an msleep call; I replaced it with a timeout based
on a presupposed pessimistic minimal throughput for the swapping
device, scaled by the minimum of swapsize and physmem.
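
To make the idea concrete, here is a minimal user-space sketch of such a
timeout calculation. It is not the code from the attached patch: the
function name, its parameters, the rounding and the 20 second floor are
all assumptions made for illustration, and it assumes the timeout is
simply the time needed to move min(swapsize, physmem) at the pessimistic
throughput.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, NOT taken from the attached patch.
 * Returns a timeout in seconds: the time needed to transfer
 * min(swap_bytes, phys_bytes) at min_bytes_per_sec.
 */
uint64_t
swap_wait_timeout_secs(uint64_t swap_bytes, uint64_t phys_bytes,
    uint64_t min_bytes_per_sec)
{
	uint64_t worst_case, secs;

	assert(min_bytes_per_sec > 0);

	/* at most this much data can be waiting to be paged */
	worst_case = swap_bytes < phys_bytes ? swap_bytes : phys_bytes;

	/* round up so that a partial second still counts */
	secs = (worst_case + min_bytes_per_sec - 1) / min_bytes_per_sec;

	/* assumption: never go below the existing 20 second timeout */
	return (secs < 20 ? 20 : secs);
}
```

For the notebook above (1G physmem, 4G swap) and an assumed minimal
throughput of 1 Meg per second, this gives 1024 seconds. In the kernel
the value would still have to be converted to ticks before being handed
to msleep.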

With this patch I can run the above code without deadlock even with
4096 (Meg) as argument.

Two remarks :

 1 I abuse linux.ko/linprocfs to correctly initialise my code
 2 This is no solution to the real 'indefinite wait buffer' problems
   since at shutdown it still panics with 'swap_pager_force_pagein: read 
   from swap failed', but at least it keeps the system functional while
   working.

I hope someone can comment on this idea.

Best regards, Arno



Received on Fri May 26 2006 - 22:25:30 UTC