Re: [CFT] Unicode collation string and reworked locale definitions

From: Pedro Giffuni <pfg_at_FreeBSD.org>
Date: Tue, 3 Nov 2015 10:05:45 -0500
Hi Baptiste;

> On 03 Nov 2015, at 02:17, Baptiste Daroussin <bapt_at_FreeBSD.org> wrote:
> 
> On Mon, Nov 02, 2015 at 06:59:15PM -0500, Pedro Giffuni wrote:
>> First of all, congratulations to Baptiste and Marino for succeeding where
>> I failed many moons ago. Also huge thanks to Nexenta and Garrett D’Amore
>> for relicensing localedef for us.
>> 
>> Concerning regex:
>> 
>> Gabor_at_ did a lot of work on libtre but according to him it was not up to the
>> task performance-wise. We would also lose features if we moved to libtre.
>> 
>> I think our regex code actually has most of what is needed for multibyte
>> already. I have a patch that turns on the functionality but I haven’t found
>> any brave soul that will do the testing:
>> 
>> https://people.freebsd.org/~pfg/patches/regex-multibyte.diff
>> 
> I think this can be tested once the collation branch is merged.

Absolutely: support for collation is critical and badly needed even without
resolving the regex issues.

> Note that
> DragonFly and musl libc both use a patched version of libtre for the regex
> implementation.
> 

I am aware. Also note that Gabor had some patches too, in order to make
it usable for bsdgrep:

https://wiki.freebsd.org/Regex

> From my non-scientific testing, libtre was more reliable and performant than our
> current regex.

According to Gabor, the general performance was better until you take into
account multibyte support, where it was clearly inferior to GNU regex.

> Anyway it will be relatively "easy" to test using the AT&T
> testsuite the reliability and performance of both implementations: ours + your
> patch and patched libtre.
> 


What worries me about libtre is that it lacks important functionality like word
delimiters. We even brought in the SysV delimiters to be more compatible with
Solaris and GNU, and we can’t back those out now:

https://svnweb.freebsd.org/base?view=revision&revision=268066
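
As a rough illustration of what those delimiters do, here is a minimal
sketch. It is not taken from the commit above: the helper name matches()
is made up for this example, and it assumes a libc whose regcomp() accepts
the \< and \> extensions (they are not part of POSIX).

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if `pattern` matches somewhere in `text`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 */
static int
matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	/* REG_NOSUB: we only need a yes/no answer, not match offsets. */
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

With such a libc, matches("\\<cat\\>", "the cat sat") should return 1,
while matches("\\<cat\\>", "concatenate") should return 0, because "cat"
is not a whole word there.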

Pedro.
Received on Tue Nov 03 2015 - 14:05:51 UTC
