Re: UTF-8 by default?

From: Tim Čas <darkuranium_at_gmail.com>
Date: Thu, 21 Jul 2016 01:39:33 +0200

I found some time to take a closer look just now, and there isn't a
problem with the `read(2)` buffer (meaning the report is a false positive).
The code could use some cleanup for easier auditability (or maybe not
... "if it ain't broken, don't fix it!"), but it's otherwise not
broken, at least not where Coverity reported the issue.
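
For anyone following along, here is a minimal sketch of the pattern in
question. It is not the actual wc.c code; `count_chars` is a hypothetical
helper name, and counting an invalid byte as one character is my own
assumption. The point is that `mbrtowc(3)` is bounded by the remaining
length, so no NUL terminator is needed, and the `mbstate_t` carries a
partial character from the end of one chunk into the next call.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the characters in buf[0..len), where buf
 * is a chunk as filled in by read(2) and is NOT NUL-terminated.
 * *st must be zero-initialized before the first chunk and is reused
 * for later chunks so that a split character is handled correctly.
 */
static size_t
count_chars(const char *buf, size_t len, mbstate_t *st)
{
	size_t count = 0;

	while (len > 0) {
		wchar_t wc;
		/* mbrtowc() examines at most `len` bytes starting at buf. */
		size_t n = mbrtowc(&wc, buf, len, st);

		if (n == (size_t)-2) {
			/*
			 * Incomplete character at the end of the chunk:
			 * all remaining bytes were absorbed into *st, so
			 * stop and let the next chunk complete it.
			 */
			break;
		}
		if (n == (size_t)-1) {
			/*
			 * Invalid sequence (assumption: reset the state,
			 * count the byte as one character, and move on).
			 */
			memset(st, 0, sizeof(*st));
			n = 1;
		} else if (n == 0) {
			/* Decoded a NUL character; it occupies one byte. */
			n = 1;
		}
		count++;
		buf += n;
		len -= n;
	}
	return (count);
}
```

A caller would zero an `mbstate_t` once, then pass each `read(2)` result
as `len` for every chunk. In a UTF-8 locale, a character split across two
chunks returns `(size_t)-2` on the first call and is counted once when
the second call completes it.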

On 21 July 2016 at 00:14, Tim Čas <darkuranium_at_gmail.com> wrote:
> On 20 July 2016 at 22:23, Don Lewis <truckman_at_freebsd.org> wrote:
>> It passes a fixed-length non-NUL terminated buffer (returned by read(2))
>> to mbrtowc().  In addition to the lack of termination, the buffer could
>> also contain a partial character at its beginning or end if the contents
>> are UTF-8.
>>
>> The Coverity ID is 978825.
>
> I don't have access to Coverity, but with boru's help, I managed to
> check the lines. There is no problem as far as I can tell --- yes, the
> buffer is not NUL-terminated [1], *BUT* `mbrtowc(3)` takes a `len`
> argument (set to the byte count returned by said `read(2)`), so it never
> reads out of bounds [2,3].
>
> The problem might still be elsewhere, though --- the code is somewhat
> hairy, so I'll give it a closer check tomorrow.
>
> [1] https://svnweb.freebsd.org/base/head/usr.bin/wc/wc.c?view=markup#l277
> [2] https://svnweb.freebsd.org/base/head/usr.bin/wc/wc.c?view=markup#l290
> [3] `man 3 mbrtowc`