On Mon, Aug 04, 2003 at 17:18:58 +0300, Ruslan Ermilov wrote:
> : The characters or collating elements in the
> : range shall be placed in the array in ascending
> : collation sequence. If the second endpoint
> : precedes the starting endpoint in the collation
> : sequence, it is unspecified whether the range

Did you read the first part, about the collation sequence? We just
implement that, i.e. the collation sequence for all locales, including
non-POSIX locales, which is allowed because the behaviour there is
unspecified.

> : of collating elements is empty, or this construct
> : is treated as invalid. In locales other than
>                          ^^^^^^^^^^^^^^^^^^^^^
> : the POSIX locale, this construct has unspecified
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> : behavior.
>   ^^^^^^^^
>
> This is identical to a similar issue with awk(1), and the latest
> snapshot of the One True AWK reverts to NOT using strcoll(3) to
> handle character ranges in RE, because different locales and even
> the same locales on different operating systems (FreeBSD, Linux,
> and Solaris were compared) have different ideas about the collating
> order. On Linux, the German locale's collating sequence will be
> ``A a ... B b'', while on FreeBSD, it's ``A B ... a b''.

This is a bug in AWK, since strcoll() is required in regexps, but we are
not discussing AWK here.

Even if it is unspecified behaviour, it means that

1) We can't use c-c for non-POSIX locales!
2) All occurrences of c-c must be either replaced or used in the C
   locale only!

In other words, you win nothing by insisting on the historical
behaviour, because its usage is ILLEGAL in any case (i.e. outside of
LANG=C).

> So I'd rather prefer if we revert to the old behavior in tr(1).

No way. The ranges should be consistent with what we have for regexps.
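
The disagreement above comes down to whether a range such as a-b is expanded by byte value or by the locale's collation sequence. A minimal sketch of the two membership tests follows, written in Python for brevity (tr(1) itself is C). The helper names in_range_bytes and in_range_coll are hypothetical, and de_DE.UTF-8 is only an assumed locale name that may not be installed on a given system.

```python
import locale


def in_range_bytes(lo, hi, ch):
    """Historical behaviour: compare raw code values, ignoring the locale."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_coll(lo, hi, ch):
    """Collation behaviour: compare with the current LC_COLLATE ordering."""
    return locale.strcoll(lo, ch) <= 0 and locale.strcoll(ch, hi) <= 0


if __name__ == "__main__":
    # "de_DE.UTF-8" is an assumption; skip it if the system lacks it.
    for name in ("C", "de_DE.UTF-8"):
        try:
            locale.setlocale(locale.LC_COLLATE, name)
        except locale.Error:
            print(name, "is not installed, skipping")
            continue
        # Which of these four letters does the range a-b contain?
        by_coll = [c for c in "ABab" if in_range_coll("a", "b", c)]
        by_byte = [c for c in "ABab" if in_range_bytes("a", "b", c)]
        print(name, "collation:", by_coll, "bytes:", by_byte)
```

For ASCII input in the C locale the two tests agree. In any other locale the collation result depends on that system's collation tables, which is the variability the quoted text describes.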
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:17 UTC