problem with sort utility?

Jakub Jelinek jakub at redhat.com
Fri Jul 1 09:29:53 UTC 2011


On Fri, Jul 01, 2011 at 07:16:29PM +1000, Cameron Simpson wrote:
> Both space and tab sort lexically before digits (which start at 48 for
> "0"). If you're using another whitespace codepoint by accident then it
> will probably come above the digits and thus change the sort result.

That's true only in LC_COLLATE=C and a couple of other locales; it is
certainly not true for most other locales.  Spaces, tabs and other
non-alphanumeric characters are often ignored for collation purposes, at
least in the first pass, and they make a difference only if the strings are
identical once those characters are removed.  Similarly, digits may have
lower priority than alphabetic characters, etc.; it all depends on your
locale.  Look at a dictionary for your specific language and see how things
are sorted there.
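
To make the two-pass idea concrete, here is a rough toy model in Python.
It is not the actual glibc collation algorithm and does not use any real
locale data; the function name toy_locale_key and the sample strings are
made up for illustration.  It only shows how ignoring non-alphanumeric
characters in the first pass can reorder lines compared with plain code
point order.

```python
def toy_locale_key(s):
    # First pass: compare only letters and digits, case-folded.
    # (Assumption: a crude stand-in for a locale's primary weights.)
    primary = "".join(ch for ch in s if ch.isalnum()).casefold()
    # Tie-break: the full string, so the ignored characters matter
    # only when the first-pass keys are identical.
    return (primary, s)


lines = ["a c", "ab", "a b", "a-b"]

# Plain code point order, which is what the C locale gives.
c_order = sorted(lines)

# Toy "locale-like" order: spaces and hyphens ignored at first.
toy_order = sorted(lines, key=toy_locale_key)

print(c_order)
print(toy_order)
```

In code point order "a c" lands between "a b" and "a-b" because space
sorts below the hyphen and the letters.  In the toy order it moves to the
end, since its first-pass key "ac" is greater than "ab".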

If you want ASCII sorting, use
LC_ALL=C sort
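
As a small sketch of what that byte-wise ordering looks like, the Python
snippet below sorts a few made-up lines by their encoded bytes.  This is
meant to approximate the order LC_ALL=C sort would produce for the same
input, assuming UTF-8 encoded text; it does not call sort itself.

```python
# Hypothetical sample lines: one starts with a tab, one with a space.
lines = ["10", " 9", "\t8", "a1", "B2"]

# Compare raw bytes, the way the C locale does: tab (9) < space (32)
# < digits (48-57) < uppercase (65-90) < lowercase (97-122).
c_locale_order = sorted(lines, key=lambda s: s.encode("utf-8"))

for line in c_locale_order:
    print(repr(line))
```

The tab and space lines come first, then the digits, then uppercase
before lowercase, which is often not the order a language-specific locale
would give.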

	Jakub

