Re: UTF-8 by default?

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Wed, 20 Jul 2016 11:33:14 -0700 (PDT)

On 20 Jul, Baptiste Daroussin wrote:
> On Wed, Jul 20, 2016 at 10:47:45AM -0230, Jonathan Anderson wrote:
>> On 20 Jul 2016, at 9:13, Tim Čas wrote:
>> 
>> > So, without further ado:
>> > 1) What are the reasons that UTF-8 isn't the default yet?
>> > 2) Would it be possible to make this the default in 11.0? What about
>> > 12.0?
>> > 3) Assuming an effort is started towards making UTF-8 the default,
>> > what changes would be required?
>> 
>> At least according to one of my students (who makes more extensive use of
>> i18n than I do), enabling UTF-8 by default is pretty straightforward:
>> 
>> https://github.com/musec/freebsd/wiki/Common-setup#utf-8-support
> 
> LC_COLLATE=C is no longer needed with FreeBSD 11+
>> 
>> If there's anything missing there, I'd love to hear about it.
>> 
> 
> A lot of work was done during 11.0 development. The following issues were
> fixed:
> 
> /bin/sh was not able to handle UTF-8 (fixed by fixing the bug in libedit)
> no Unicode collation: fixed, but the code is still very fresh
> vi: there was potential corruption when opening a file in a non-Unicode
> encoding in a Unicode environment; it no longer corrupts anything, but it
> still says it is unhappy
> finger(1) has been fixed for multibyte names (I know no one cares about that
> one :))
> 
> On the list of still known issues:
> * important:
>   - csh does not handle unicode
>   - regex in libc: it does not handle Unicode correctly (unless I have missed
>     something) and needs to be either fixed or switched to libtre + custom
>     patches (there was a Summer of Code project about it long ago, and
>     DragonFly went that way)
>   - Unicode support in our old groff is pretty bad; I plan to replace it with
>     heirloom-doctools, which does handle Unicode properly (as far as I have
>     tested, at least)
>   - edit(1) does not handle multibyte
> 
> * medium (minor?)
>   - login(1) does not handle unicode properly
> 
> * minor:
>   - lots of base tools (minor ones like nl and friends) are not multibyte
>     aware in a lot of cases; merging the work done by Ingo Schwarze on
>     those tools in OpenBSD would probably be useful, but I have no plans to
>     do it
>   - vi needs improvement in multi-encoding support; I haven't checked the
>     latest upstream vi changes in that area
> 
> There might be more, but that is all that comes to mind right now

As I recall, Coverity pointed out problems with the multibyte support in
wc(1).
Received on Wed Jul 20 2016 - 16:33:26 UTC
