Now I am going to provide the second informational part about the new BSD sort: the differences between the new BSD sort and the GNU sort programs. Below: NBSD - new BSD sort; OGNU - old BSD/GNU sort, version 5.3.0; NGNU - new GNU sort, version 8.15. There are several areas: 1) -k option (sort fields/keys), with single byte locales. The -k option functionality is described in POSIX standard with deceitfully simple wording. Unfortunately, the real meaning is rather complex, and sometimes it defies the common sense. For example, a sort key can spread across boundaries with subsequent sort keys. A common-sense-based implementation would fail the POSIX standard. The OGNU implementation fails to follow the standard. In our tests, there are thousands various use cases when it does not work properly. The OGNU exact algorithm and the failure pattern is a mystery for us. The NGNU implements the POSIX algorithm properly, and the NBSD is compatible with NGNU when using -k in single-byte locales. 2) -k option (sort fields/keys), with multi-byte locales. Here the situation is reverse. During the -k calculation, NGNU basically ignores the symbol sizes (but it does not ignore the symbol sizes when it compares the fields). The result is that the sort keys are not calculated correctly by the NGNU. OGNU, on the other hand, seems to use the correct symbol sizes, but it is still using incorrect algorithm. So, the overall picture is: - OGNU uses wrong algorithm, but it uses correct symbol sizes; - NGNU uses correct algorithm, but it uses wrong symbol sizes; - NBSD uses NGNU-compatible algorithm with correct symbol sizes. So, for big majority of users who are using only single-byte locales, the NBSD sort key calculation is compatible with NGNU. For the multi-byte locales (like zh_HK.Big5HKSCS), we believe that only NBSD provides the correct results with -k option. 3) NGNU does not allow options -d and/or -i together with numeric sorting. OGNU and NBSD allows that. 4) NBSD adds NGNU-compatible options, which are absent in OGNU: -C (silent version of -c) -h ("human" numeric sort) -R (random sort) -V (version sort) --batch-size --compress-program --random-source --debug --files0-from 5) NBSD adds several of its own new proprietary options: --mergesort --qsort --heapsort --radixsort --nthreads=... (multi-threaded build only) 6) NBSD does not support option --parallel of NGNU. It has roughly the same meaning as --nthreads in NBSD. This difference will be fixed soon. Thanks Oleg > -----Original Message----- > From: owner-freebsd-current_at_freebsd.org [mailto:owner-freebsd- > current_at_freebsd.org] On Behalf Of Oleg Moskalenko > Sent: Wednesday, June 27, 2012 6:45 PM > To: FreeBSD Current > Subject: RE: [HEADS-UP] BSD sort is the default sort in -CURRENT > > Hi > > As promised, I am supplying an example of comparison between several > sort programs. > > The test file is a randomly generated 1,000,000 lines, each line > contain a single floating point number. > > We are going to sort it three ways - as text, as -n numeric sort, and > as -g numeric sort, with 4 programs: > 1) Old BSD/GNU sort 5.3.0 > 2) New GNU sort 8.15 > 3) New BSD sort, single threaded > 4) New BSD sort, multi-threaded > > The system is a 3-CPUs system, 1.5Gb of RAM, FreeBSD version 8.2. All > times are in seconds. Locale C. > > ============================================== > > TEXT SORT > > sys user real > Old BSD/GNU sort: 0.0 1.692 2.008 > New GNU sort: 0.0 2.279 1.605 > New BSD sort, st: 0.0 1.964 2.300 > New BSD sort, mt: 0.0 2.385 1.897 > > ============================================== > > NUMERIC SORT -n > > sys user real > Old BSD/GNU sort: 0.0 4.357 4.674 > New GNU sort: 0.0 8.839 5.134 > New BSD sort, st: 0.0 5.308 5.592 > New BSD sort, mt: 0.0 4.581 2.489 > > ============================================== > > NUMERIC SORT -g > > sys user real > Old BSD/GNU sort: 0.0 45.378 45.630 > New GNU sort: ~450 ~121 ~300 > New BSD sort, st: 0.33 4.334 5.992 > New BSD sort, mt: 11.140 4.624 8.983 > > =============================================== > > Thanks > Oleg > > > > > > _______________________________________________ > freebsd-current_at_freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current- > unsubscribe_at_freebsd.org"Received on Fri Jun 29 2012 - 18:50:37 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:28 UTC