Re: Read-triggered corruption of swap backed MD devices

From: <asomers_at_gmail.com>
Date: Fri, 24 May 2013 09:40:44 -0600

Fast work, Konstantin!  This looks like it may be the solution to an
intermittent and inexplicable bug we've been seeing that we feared may
be data corruption in ZFS.

On Fri, May 24, 2013 at 2:53 AM, Konstantin Belousov
<kostikbel_at_gmail.com> wrote:
> On Fri, May 24, 2013 at 12:19:44PM +1000, Lawrence Stewart wrote:
>> Hi all,
>>
>> I tracked the cause of a colleague's nanobsd image creation problem to
>> what appears to be some nasty behaviour with swap-backed MD devices.
>> I've verified the behaviour exists on three separate systems running
>> 10-CURRENT r250260, 9-STABLE r250824 and 9-STABLE r250925.
>>
>> The following minimal reproduction recipe (run as root)
>> deterministically triggers the behaviour for me on the 3 systems I've
>> tested:
>>
>> env MD_DEV=`mdconfig -an -t swap -s 1m -x 63 -y 16` sh -c '(
>> fdisk -I md${MD_DEV} ;
>> bsdlabel -w -B md${MD_DEV}s1 ;
>> bsdlabel md${MD_DEV}s1 ;
>> dd if=/dev/md${MD_DEV} of=/dev/null bs=64k ;
>> bsdlabel md${MD_DEV}s1 ;
>> mdconfig -d -u ${MD_DEV})'
>>
>> By changing the mdconfig "-t swap" argument to "-t malloc", the bsdlabel
>> remains intact after the dd command completes.
>>
>> I've included command line recipe runs from my 10-CURRENT r250260 laptop
>> with both "-t swap" and "-t malloc" at the end of this email for reference.
>>
>> Smells like a VM related problem to me, but ENOCLUE so I would
>> appreciate some help.
>
> Confirmed, and the following patch fixed the issue for me.
>
> diff --git a/sys/dev/md/md.c b/sys/dev/md/md.c
> index e871d8f..57c5b57 100644
> --- a/sys/dev/md/md.c
> +++ b/sys/dev/md/md.c
> @@ -829,7 +829,9 @@ mdstart_swap(struct md_s *sc, struct bio *bp)
>                 m = vm_page_grab(sc->object, i, VM_ALLOC_NORMAL |
>                     VM_ALLOC_RETRY);
>                 if (bp->bio_cmd == BIO_READ) {
> -                       if (m->valid != VM_PAGE_BITS_ALL)
> +                       if (m->valid == VM_PAGE_BITS_ALL)
> +                               rv = VM_PAGER_OK;
> +                       else
>                                 rv = vm_pager_get_pages(sc->object, &m, 1, 0);
>                         if (rv == VM_PAGER_ERROR) {
>                                 vm_page_wakeup(m);
> @@ -854,6 +856,8 @@ mdstart_swap(struct md_s *sc, struct bio *bp)
>                 } else if (bp->bio_cmd == BIO_WRITE) {
>                         if (len != PAGE_SIZE && m->valid != VM_PAGE_BITS_ALL)
>                                 rv = vm_pager_get_pages(sc->object, &m, 1, 0);
> +                       else
> +                               rv = VM_PAGER_OK;
>                         if (rv == VM_PAGER_ERROR) {
>                                 vm_page_wakeup(m);
>                                 break;
> @@ -868,6 +872,8 @@ mdstart_swap(struct md_s *sc, struct bio *bp)
>                 } else if (bp->bio_cmd == BIO_DELETE) {
>                         if (len != PAGE_SIZE && m->valid != VM_PAGE_BITS_ALL)
>                                 rv = vm_pager_get_pages(sc->object, &m, 1, 0);
> +                       else
> +                               rv = VM_PAGER_OK;
>                         if (rv == VM_PAGER_ERROR) {
>                                 vm_page_wakeup(m);
>                                 break;
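
For anyone skimming the patch: each hunk makes the branch that skips
vm_pager_get_pages() set rv = VM_PAGER_OK, so the checks that follow never
see a value left over from an earlier pass through the loop.  Below is a
minimal userland sketch of that pattern, not the md.c code.  Every name in
it (pager_rv, struct page, fake_get_page, read_pages, label_survives) is a
hypothetical stand-in, and the "zero the page when the pager has no copy"
step is an assumption about the surrounding code that the diff does not
show.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the VM_PAGER_* return codes. */
enum pager_rv { PAGER_OK, PAGER_FAIL, PAGER_ERROR };

/* Hypothetical page: a validity flag plus a few bytes of payload. */
struct page {
	int  valid;
	char data[8];
};

/* Hypothetical pager with no backing copy of any page: always FAIL. */
enum pager_rv
fake_get_page(struct page *m)
{
	(void)m;
	return (PAGER_FAIL);
}

/*
 * Read loop in the shape of the patched code.  With patched == 0 the
 * valid-page branch leaves rv untouched, so it still holds whatever the
 * previous iteration stored.  With patched != 0 it is reset to PAGER_OK.
 */
void
read_pages(struct page *pages, int n, int patched)
{
	enum pager_rv rv = PAGER_OK;
	int i;

	assert(pages != NULL);
	for (i = 0; i < n; i++) {
		struct page *m = &pages[i];

		if (m->valid) {
			if (patched)
				rv = PAGER_OK;
		} else
			rv = fake_get_page(m);

		if (rv == PAGER_ERROR)
			break;
		if (rv == PAGER_FAIL) {
			/* Assumed behaviour: no backing copy, so zero-fill. */
			memset(m->data, 0, sizeof(m->data));
			m->valid = 1;
		}
	}
}

/*
 * Page 0 has never been written, page 1 holds valid data ("label").
 * Returns 1 if page 1 still holds its data after the read pass.
 */
int
label_survives(int patched)
{
	struct page p[2] = { { 0, "" }, { 1, "label" } };

	read_pages(p, 2, patched);
	return (strcmp(p[1].data, "label") == 0);
}
```

In this sketch label_survives(0) returns 0 because the stale PAGER_FAIL from
page 0 causes the valid page 1 to be zero-filled, while label_survives(1)
returns 1.  That matches the symptom in Lawrence's recipe, where a plain dd
read wipes the bsdlabel, but whether the real failure takes exactly this
path is my reading of the patch and not something the thread confirms.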
Received on Fri May 24 2013 - 13:40:46 UTC
