On Sun, 22 Aug 2010, Dag-Erling Smørgrav wrote: > Mike Haertel <mike_at_ducky.net> writes: >> GNU grep uses the well-known Boyer-Moore algorithm, which looks first >> for the final letter of the target string, and uses a lookup table to >> tell it how far ahead it can skip in the input whenever it finds a >> non-matching character. > > Boyer-Moore is for fixed search strings. I don't see how that > optimization can work with a regexp search unless the regexp is so > simple that you break it down into a small number of cases with known > length and final character. When I was working on making FreeGrep faster (years ago), I wrote down a few notes about possible algorithms, especially those that could be useful for fgrep functionality. I am just passing these onto the list. Some algorithms: 1. http://en.wikipedia.org/wiki/Aho-Corasick_string_matching_algorithm 2. http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm 3. GNU fgrep: Commentz-Walter 4. GLIMPSE: http://webglimpse.net/pubs/TR94-17.pdf (Boyer-Moore variant) Also, this may be a useful book: http://www.dcc.uchile.cl/~gnavarro/FPMbook/ Sean -- scf_at_FreeBSD.orgReceived on Sun Aug 22 2010 - 14:30:08 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:06 UTC