RE: [HEADS-UP] BSD sort is the default sort in -CURRENT

From: Oleg Moskalenko <oleg.moskalenko_at_citrix.com>
Date: Wed, 27 Jun 2012 02:09:48 -0700
Doug, I'll post some performance figures, probably tomorrow.

But I do not agree with you that we have to reproduce the old sort bugs.
It makes no sense and I am not going to do that. Absolutely not.

If some old scripts rely on buggy behavior
(and I hope they do not), then the old scripts must be fixed. Period.
The system cannot grow by replicating old bugs.

All system scripts that I've seen are using pretty basic sort features. In the basic
area, the old sort and the new sort are 100% compatible. The incompatibilities are 
in more complex areas (numeric sorts and unusual key-based sorts).

I actually tested the new sort against the old GNU sort. There are some incompatibilities.
All of them are due to bugs in the old GNU sort. The new BSD sort program
is compatible with the new GNU sort, a much cleaner program than the old GNU sort.

Try installing the new GNU coreutils. If the scripts work with the new GNU sort
(version 8.15 and later), then they will work with the new BSD sort.
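
A quick way to check a script's sort invocations is to feed the same input to two sort binaries and compare the output. The following is only a minimal sketch: the helper name compare_sorts is hypothetical, and the command names in the usage note are assumptions, not something confirmed in this thread.

```python
import os
import subprocess
import sys


def compare_sorts(cmd_a, cmd_b, data):
    """Run two sort commands on the same bytes and compare their output.

    cmd_a and cmd_b are argument lists, e.g. ["sort", "-n"].
    Returns (outputs_match, output_a, output_b).
    """
    # Force the C locale so collation differences do not mask real
    # behavioral differences between the two implementations.
    env = dict(os.environ, LC_ALL="C")
    out_a = subprocess.run(cmd_a, input=data, capture_output=True,
                           check=True, env=env).stdout
    out_b = subprocess.run(cmd_b, input=data, capture_output=True,
                           check=True, env=env).stdout
    return out_a == out_b, out_a, out_b
```

For example, compare_sorts(["sort", "-n"], ["gsort", "-n"], data) would compare the system sort with a GNU sort installed under the assumed name gsort. Any mismatch points at an option combination worth checking by hand.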

There is a POSIX standard, and the program must be compatible with the POSIX standard.

Take care,
Oleg

> -----Original Message-----
> From: Doug Barton [mailto:dougb_at_FreeBSD.org]
> Sent: Wednesday, June 27, 2012 1:35 AM
> To: Oleg Moskalenko
> Cc: Gabor Kovesdan; FreeBSD Current
> Subject: Re: [HEADS-UP] BSD sort is the default sort in -CURRENT
> 
> On 06/26/2012 11:48 PM, Oleg Moskalenko wrote:
> >
> >
> >> -----Original Message----- From: Doug Barton
> >> [mailto:dougb_at_FreeBSD.org] Sent: Tuesday, June 26, 2012 11:18 PM
> >> To: Gabor Kovesdan Cc: FreeBSD Current; Oleg Moskalenko Subject:
> >> Re: [HEADS-UP] BSD sort is the default sort in -CURRENT
> >>
> >> On 06/26/2012 11:04 PM, Gabor Kovesdan wrote:
> >>> Hi Folks,
> >>>
> >>> as I announced before, the default sort in -CURRENT has been
> >>> changed to BSD sort.
> >>
> >> Has this been performance tested vs. the old one? If so, where are
> >> the results?
> >
> > Of course it was performance-tested.
> 
> Great, can you post the results somewhere? I understand what you're
> saying below that there are situations where worse performance may need
> explanation, but it would be helpful if we had the data to look at.
> 
> > As this is a totally different
> > program with different algorithms, it has a totally different
> > performance profile on different tests, compared to the old sort. In
> > the default compilation mode (single thread sort) the performance is
> > on the same level as the old sort (sometimes faster, sometimes
> > slower). The new sort is often significantly faster in numeric sort
> > tests. In "experimental" multi-threading mode, the new sort is much
> > faster than the old sort on multi-CPU systems.
> 
> This sounds encouraging. Is there a knob to enable the threaded build?
> 
> > The sort speed comparison is not actually fair because the old sort
> > cuts some corners and has a number of bugs.
> 
> Understood, but the existing sort is what we're changing away from, so
> that's what we have to test against. What we don't want is a situation
> where we are switching to the new sort by default without understanding
> what the tradeoffs are. (IOW, we don't want a repeat of the situation
> with grep.)
> 
> > The concrete figures do not make much sense because if you change the
> > sort file you get a totally different performance ratio.
> 
> I'm assuming that you'd run the performance tests on various different
> input files, and report the different results.
> 
> >> Has this been thoroughly regression-tested against the old
> >> version, option by option?
> >
> > Of course we have regression tests. We have an overnight test
> > that runs through probably 17 million different sort option
> > combinations.
> 
> This sounds great, but ...
> 
> > But we actually had to compare the new sort against a
> > fresh GNU sort implementation (version 8.15), because the old BSD GNU
> > sort is very buggy and testing against the old GNU sort makes no
> > sense.
> 
> ... this not so much. The existing sort is what people have now, and
> what they rely on, particularly for scripts. Comparing apples to
> oranges doesn't help us understand how things are going to be
> different with the new version.
> 
> Doug
Received on Wed Jun 27 2012 - 07:09:51 UTC