[CFT] patch to replace the regex code

From: Gabor Kovesdan <gabor_at_FreeBSD.org> Date: Sat, 25 Jun 2011 14:51:40 +0100

Hi Folks,

you may know that in the Summer of Code programme I'm working on 
replacing the old regex code with TRE, which is a BSD-licensed 
implementation. It supports wide characters, is POSIX-compliant, and 
performs well compared to most open source implementations. In my own 
tests, though, I got mixed results. With sed, in the cases that I tested, 
performance was more or less the same, and in a few cases TRE finished 
in half the time. With grep, on the other hand, TRE was sometimes 
significantly slower than the current regex code. However, grep has 
always been a complicated case and it has its own regex code; it was 
only used for testing here. I'm still working on some optimizations, 
but apart from grep, the current performance may already be satisfactory 
for normal cases. This is one thing that I would ask interested testers 
to focus on: whether the scripts you habitually run finish later or sooner. 
I've also checked POSIX compliance and found some cases where TRE 
is more permissive than the current implementation, but that should not 
be a problem. The patch that I provide now could probably use a cleanup 
in the contrib area, but it's just an early patch purely for testing 
purposes, so please avoid nitpicking for now and only report performance 
and/or functional problems. There's a code slush now, so there's plenty 
of time to sort this out if it proves ready to go into 10-CURRENT. Thanks 
to all of you who take the effort to give it a try.
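
For anyone who prefers numbers to impressions, a before/after timing 
comparison could be sketched roughly as below. This is only an 
illustration and not part of the patch: the helper name time_command, 
the sample input and the sed expression are all assumptions chosen for 
the example. The idea is to run the same script on an unpatched and a 
patched system and compare the medians.

```python
import statistics
import subprocess
import time


def time_command(argv, input_bytes=b"", runs=5):
    """Run argv `runs` times with input_bytes on stdin.

    Returns the median wall-clock time in seconds. The command's
    output is discarded; a non-zero exit status raises an exception.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            input=input_bytes,
            stdout=subprocess.DEVNULL,
            check=True,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Hypothetical workload: many short lines pushed through one
    # sed substitution. Replace with a script you run habitually.
    data = b"foo bar baz 12345\n" * 200000
    elapsed = time_command(["sed", "-e", "s/[0-9][0-9]*/N/g"], data)
    print(f"median: {elapsed:.3f} s")
```

Using the median of several runs rather than a single run reduces the 
influence of cache warm-up and other background noise on the result.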

The patch is here: http://kovesdan.org/patches/tre-20110724.diff

Regards,
Gabor Kovesdan