Re: Native Encryption for ZFS on FreeBSD CFT

From: Thomas Caputi <tcaputi_at_datto.com>
Date: Wed, 22 Aug 2018 23:21:53 -0400

> That doesn't answer the question about what happens when dedup is turned off.  In that case, is the HMAC still used as the IV?  If so, then watermarking attacks are still possible.

Quoting the comment from the code above: "For non-dedup blocks we
derive the IV randomly". When dedup is enabled, we do leak this
information, but the dedup table already leaks that information
anyway. The dedup table needs to be in plaintext so that we can repair
it even when keys are not loaded. This is a known and documented trade
off of using encryption + dedup.
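
To make the dedup vs. non-dedup distinction concrete, here is a
minimal Python sketch, not the actual zio_crypt.c code. The name
derive_iv, the choice of HMAC-SHA512, and the truncation to the first
12 bytes are assumptions for illustration. The only details taken from
this thread are the 96-bit IV length, the random IV for non-dedup
blocks, and an HMAC-based IV for dedup blocks.

```python
import hashlib
import hmac
import os

IV_LEN = 12  # 96-bit IV, as stated in the quoted zio_crypt.c comment


def derive_iv(plaintext: bytes, dedup: bool, hmac_key: bytes) -> bytes:
    """Return a 96-bit IV for one block (illustrative model only)."""
    if dedup:
        # ASSUMPTION: a truncated HMAC-SHA512 of the plaintext. The real
        # construction is not spelled out in this thread.
        digest = hmac.new(hmac_key, plaintext, hashlib.sha512).digest()
        return digest[:IV_LEN]
    # Non-dedup path: the IV comes from the system random source.
    return os.urandom(IV_LEN)


key = os.urandom(32)      # hypothetical per-dataset HMAC key
block = b"\x00" * 4096    # two writes of the same plaintext block

dedup_ivs_match = derive_iv(block, True, key) == derive_iv(block, True, key)
random_ivs_match = derive_iv(block, False, key) == derive_iv(block, False, key)
print("dedup IVs match:", dedup_ivs_match)
print("random IVs match:", random_ivs_match)
```

Identical blocks get the same IV only on the dedup path, which is the
equality leak discussed above. On the random path the IVs differ,
barring a 2^-96 collision.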

> Only encrypting L0 blocks also leaks a lot of information.  That means that, if encryption is set to anything but "off", watermarking attacks will still be possible based on the size and sparsity of a file.  Because I believe that with any encryption mode, ZFS turns continuous runs of zeros into holes.

First of all, with encryption=off, watermarking attacks are really
quite easy :). The information that can be gained about a file from
ZFS by looking at the raw disk is:

1) The size of the file (rounded up to the nearest sector size):
Almost all applications that encrypt data will leak the approximate
size of the protected payload.

2) The locations of holes within a file: ZFS does not turn runs of
zeros into holes if you have compression off. However, data that is
never written is maintained as a hole (i.e. if you never write any data
to block 3 of a file). You are correct that technically this is a
small leak of information, but we decided while designing the
encryption scheme that the performance and space savings are worth it
here. Is this enough information to be an attack vector? I would argue
not, but if you are paranoid you could always turn compression off and
fill in all the holes of your files with zeros.

3) If dedup is on, you can see which blocks have deduped against other
blocks within a clone family. Encrypted dedup only works within
datasets that share the same master encryption key, which is
essentially just snapshots and clones of snapshots. You cannot write
data to one encrypted dataset and analyze the dedup tables to see if
the data you wrote deduped against another dataset's data.

4) If compression + encryption is on, a CRIME attack is possible, but
in almost every scenario this attack is impractical. It requires the
filesystem to have the key loaded, an application that appends a
secret to data controlled by the attacker, root access to the running
system (to read the raw disk without rebooting and unloading the
encryption key), and the ability to do many iterations of writing
this attacker + secret data to disk and checking the size of the
resulting ciphertext.
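
The compression side channel in item 4 can be sketched in a few lines
of Python. This is a toy model under stated assumptions, not an attack
on ZFS: zlib stands in for whichever compressor the dataset uses, the
SECRET value and the stored_size helper are hypothetical, and
encryption is left out on the assumption that it does not hide the
compressed length. On a real pool the observable size is also rounded
up to whole sectors (item 1), so the signal would be coarser.

```python
import zlib

# Hypothetical secret that the application appends to attacker data.
SECRET = b"session=7f3a9c1e5b2d8a64"


def stored_size(attacker_data: bytes) -> int:
    """Compressed size of attacker_data + SECRET.

    Models what an observer of the raw disk could measure after the
    block is compressed and then encrypted (length assumed preserved).
    """
    return len(zlib.compress(attacker_data + SECRET))


# A guess that matches the secret compresses better than one that does not,
# because the compressor replaces the repeated bytes with a back-reference.
right_guess = stored_size(b"session=7f3a9c1e5b2d8a64")
wrong_guess = stored_size(b"session=e04b6d17c9a3f528")
print("right guess:", right_guess, "bytes; wrong guess:", wrong_guess, "bytes")
```

An attacker would repeat this write-and-measure loop many times,
extending the guess a little on each round, which is why the
preconditions listed in item 4 all have to hold at once.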

During the implementation of native ZFS encryption we evaluated these
and came to the conclusion that the security risks here are easily
outweighed by the usability and performance benefits. If you have any
further questions about the design, feel free to email me again or
take a look at the (largely diagram based) docs on the implementation:
https://docs.google.com/presentation/d/1km-z3MVNHYwlQLY6yEC3iq-TD05eredH9Ih4umGdkJw/edit?usp=sharing

On Wed, Aug 22, 2018 at 6:39 PM Matthew Macy <mmacy_at_freebsd.org> wrote:
>
> Hi Thomas,
>
> Alan believes that, even with dedup disabled, the ZFS native encryption support is vulnerable to watermarking attacks. I don't have enough exposure to crypto to pass any judgement and was hoping that you'd share your point of view. Thanks in advance.
>
> -M
>
>
>
> On Wed, Aug 22, 2018 at 12:42 PM Alan Somers <asomers_at_freebsd.org> wrote:
>>
>> Only encrypting L0 blocks also leaks a lot of information.  That means that, if encryption is set to anything but "off", watermarking attacks will still be possible based on the size and sparsity of a file.  Because I believe that with any encryption mode, ZFS turns continuous runs of zeros into holes.  And I don't see anything in zio_crypt.c that addresses that.
>> -Alan
>>
>> On Wed, Aug 22, 2018 at 1:23 PM Sean Fagan <sef_at_ixsystems.com> wrote:
>>>
>>> On Aug 22, 2018, at 12:20 PM, Alan Somers <asomers_at_freebsd.org> wrote:
>>> > That doesn't answer the question about what happens when dedup is turned off.  In that case, is the HMAC still used as the IV?  If so, then watermarking attacks are still possible.  If ZFS switches to a random IV when dedup is off, then it would probably be ok.
>>>
>>> From the same file:
>>>
>>>  * Initialization Vector (IV):
>>>  * An initialization vector for the encryption algorithms. This is used to
>>>  * "tweak" the encryption algorithms so that two blocks of the same data are
>>>  * encrypted into different ciphertext outputs, thus obfuscating block patterns.
>>>  * The supported encryption modes (AES-GCM and AES-CCM) require that an IV is
>>>  * never reused with the same encryption key. This value is stored unencrypted
>>>  * and must simply be provided to the decryption function. We use a 96 bit IV
>>>  * (as recommended by NIST) for all block encryption. For non-dedup blocks we
>>>  * derive the IV randomly. The first 64 bits of the IV are stored in the second
>>>  * word of DVA[2] and the remaining 32 bits are stored in the upper 32 bits of
>>>  * blk_fill. This is safe because encrypted blocks can't use the upper 32 bits
>>>  * of blk_fill. We only encrypt level 0 blocks, which normally have a fill count
>>>  * of 1. The only exception is for DMU_OT_DNODE objects, where the fill count of
>>>  * level 0 blocks is the number of allocated dnodes in that block. The on-disk
>>>  * format supports at most 2^15 slots per L0 dnode block, because the maximum
>>>  * block size is 16MB (2^24). In either case, for level 0 blocks this number
>>>  * will still be smaller than UINT32_MAX so it is safe to store the IV in the
>>>  * top 32 bits of blk_fill, while leaving the bottom 32 bits of the fill count
>>>  * for the dnode code.
>>>
>>> Sean
>>>
>>>
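
The IV placement described in the quoted comment can be sketched as
simple bit packing. This is an illustrative Python model, not ZFS
code: pack_iv and unpack_iv are hypothetical names, the little-endian
byte order is an assumption, and the 512-byte dnode slot size used to
check the 2^15 figure is not stated in the comment itself.

```python
from typing import Tuple

IV_LEN = 12                 # 96-bit IV
FILL_MASK = 0xFFFFFFFF      # low 32 bits of blk_fill hold the fill count

# 16MB maximum block size divided by an ASSUMED 512-byte dnode slot
# gives the 2^15 slots mentioned in the comment.
MAX_DNODE_SLOTS = (1 << 24) // 512


def pack_iv(iv: bytes, fill_count: int) -> Tuple[int, int]:
    """Return (second word of DVA[2], blk_fill) for an encrypted L0 block."""
    if len(iv) != IV_LEN:
        raise ValueError("IV must be 96 bits")
    if not 0 <= fill_count <= FILL_MASK:
        raise ValueError("fill count must fit in 32 bits")
    first64 = int.from_bytes(iv[:8], "little")   # byte order is an assumption
    last32 = int.from_bytes(iv[8:], "little")
    # Upper 32 bits carry the IV tail, lower 32 bits keep the fill count.
    return first64, (last32 << 32) | fill_count


def unpack_iv(dva2_word: int, blk_fill: int) -> Tuple[bytes, int]:
    """Recover the IV and the fill count from the two 64-bit words."""
    iv = dva2_word.to_bytes(8, "little") + (blk_fill >> 32).to_bytes(4, "little")
    return iv, blk_fill & FILL_MASK


word, fill = pack_iv(bytes(range(12)), MAX_DNODE_SLOTS)
iv, count = unpack_iv(word, fill)
print(iv.hex(), count)
```

Because the largest fill count for a level 0 block fits well within
32 bits, the two fields never overlap and both round-trip unchanged.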