Re: uniq truncates lines > 2048 bytes

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Tue, 25 Jan 2005 14:38:34 -0800

On Wed, Jan 26, 2005 at 09:10:47AM +1100, Tim Robbins wrote:
> On Tue, Jan 25, 2005 at 11:51:51AM -0600, Scot Hetzel wrote:
> > I noticed that if a file has lines > 2048 bytes, uniq will truncate
> > the line to LINE_MAX (2048 bytes). An easy way to test this is to do
> > the following:
> > 
> > cd /usr/ports/accessibility/gnomemag
> > make fetch-list > test.list
> > make fetch-list >> test.list
> > uniq test.list > test2.list
> > 
> > test2.list should be half the size of test.list, but it is 2048 bytes.
> > 
> > I have come up with a patch to uniq that fixes this problem.
> > 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=76578
> 
> This looks good except for failure to check for realloc() returning NULL
> and a few minor style problems. It may be possible to use fgetwln()
> to read lines instead of getwc() + realloc() etc., but this function is
> new and peculiar to FreeBSD.
> 
> I was planning on going through all text-processing utilities in the base
> system some time and either fixing line length problems or documenting them,
> similar to what I did with multibyte character support. I may make a start
> at that today.
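
For reference, the getwc() + realloc() approach with the NULL check
mentioned above might look roughly like the sketch below. This is not
the code from the PR: read_long_line() is a made-up name, and the
starting capacity of 64 is an arbitrary choice for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from the PR): read one line of any length
 * from fp into a heap buffer that grows as needed.
 * Returns the line without its trailing newline (caller frees it),
 * or NULL when nothing more can be read or memory runs out.
 */
wchar_t *
read_long_line(FILE *fp)
{
	wchar_t *buf, *nbuf;
	size_t cap, len;
	wint_t ch;

	assert(fp != NULL);
	cap = 64;	/* arbitrary starting size, doubled on demand */
	len = 0;
	if ((buf = malloc(cap * sizeof(*buf))) == NULL)
		return (NULL);

	while ((ch = getwc(fp)) != WEOF && ch != L'\n') {
		/* Always leave room for this character plus the terminator. */
		if (len + 1 >= cap) {
			/* Keep the old pointer so it can be freed on failure. */
			nbuf = realloc(buf, cap * 2 * sizeof(*buf));
			if (nbuf == NULL) {
				free(buf);
				return (NULL);
			}
			buf = nbuf;
			cap *= 2;
		}
		buf[len++] = (wchar_t)ch;
	}
	if (ch == WEOF && len == 0) {
		/* Nothing was read: end of file or a read error. */
		free(buf);
		return (NULL);
	}
	buf[len] = L'\0';
	return (buf);
}
```

Assigning the result of realloc() to a temporary first is what makes
the NULL check useful: on failure the original buffer is still
reachable and can be freed instead of leaked.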

If someone could fix comm(1) that would be a big help for me, because
I have a local hack I have to carry around in all of my local package
source trees.

Kris


Received on Tue Jan 25 2005 - 21:38:35 UTC
