Re: Port of OpenBSD's sdiff

From: LI Xin <delphij_at_delphij.net> Date: Tue, 26 Jun 2007 17:03:47 +0800

Andrey Chernov wrote:
> On Tue, Jun 26, 2007 at 10:11:58AM +0200, Ollivier Robert wrote:
>> According to Xin LI:
>>> Our current implementation is slower than many other implementations,
>>> especially the BSD licensed PCRE.  This has in turn made a lot of our
>>> utilities slow.  For instance sed -e 's/^foo [0-9]{3} bar.+$/\1/g' seems
>>> to use O(N^2) time where N is the length of the text being processed.
>> I'm currently looking into replacing our ancient library (based on H.
>> Spencer code from decades ago) with either PCRE (which is nicely BSD
>> licensed as you say) or the new code from Mr. Spencer (taken from Tcl or
>> postgresql) or even maybe Oniguruma, the new library used by Ruby.
>>
>> I agree, anything will be better than the one we have.
> 
> Please choose a variant that definitely supports multibyte characters.

If memory serves me right, all of the regex libraries Ollivier has
mentioned support multibyte characters.  Maybe we should create or find
some test cases to make sure there are no regressions?
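
As a rough idea of what such a test case could look like, here is a
minimal sketch built on the POSIX regcomp()/regexec() interface.  The
helper name re_matches() is made up for illustration, and a candidate
library would need to offer a POSIX-compatible wrapper for the same
test to be reused unchanged.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of any library): returns 1 if `text`
 * matches the POSIX extended regex `pattern`, 0 if it does not, and
 * -1 if the pattern fails to compile.
 */
int
re_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	/* REG_NOSUB: only a yes/no answer is needed, no submatch offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

A multibyte check would call setlocale(LC_CTYPE, "") under a UTF-8
locale first and then test something like re_matches("^.$", "\xc3\xa9"),
which should only succeed if '.' consumes a whole character rather than
a single byte.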

Cheers,
-- 
Xin LI <delphij_at_delphij.net>	http://www.delphij.net/
FreeBSD - The Power to Serve!