Awk and sort (of text files)

jd1008 jd1008 at gmail.com
Tue Jun 30 22:57:47 UTC 2015



On 06/30/2015 04:48 PM, Cameron Simpson wrote:
> On 30Jun2015 14:35, Bill Oliver <vendor at billoblog.com> wrote:
>> On Mon, 29 Jun 2015, jd1008 wrote:
>>
>>> [snip]
>>> Here is the simplest solution and it does what I want without 
>>> resorting to awk:
>>> for i in `/bin/ls -1 lists*`; do
>>> sed '/./{H;d;};x;s/\n/={NL}=/g' $i | sort |
>>> sed '1s/={NL}=//;s/={NL}=/\n/g' > $i.sorted.txt
>>> done
>>
>> I bow before a Master.
>> So, I'm trying to parse this...
>>
>> I don't know what "NL" does.  From my reading I see the N command 
>> adds the current line to the pattern space with a newline character. 
>> I can't figure out what the "L" does, though, or if NL is a different 
>> command than "N" followed by "L"
>
> The "NL" is not a command. It is simply a piece of text to insert into 
> the line in place of newlines. (I'm not sure why - you can certainly 
> hold multiple lines in the hold space.)
>
> So the code pulls lines into the hold space and replaces the newline 
> characters with the text "NL". Then later it undoes that, replacing 
> the text "NL" with a newline character.
>
> Personally I tend to use a nontexty character for this kind of 
> placeholder, such as ^G. Less risk of encountering that in the input 
> text, and therefore less risk of accidentally mangling it.
>
> Cheers,
> Cameron Simpson <cs at zip.com.au>
>
> Don't have awk? Use this simple sh emulation:
>    #!/bin/sh
>    echo 'Awk bailing out!' >&2
>    exit 2
> - Tom Horsley <tahorsley at csd.harris.com>
Hi Cameron,
It is not just NL that the newline (on Linux, ^J) is being replaced
with. It is ={NL}=
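
To make the round trip concrete, here is a minimal sketch of the same
pipeline wrapped in a function and run on a tiny sample. The function
name sort_paragraphs and the sample text are made up for illustration.
It assumes GNU sed (for \n in the replacement) and that every
paragraph, including the last one, is followed by an empty line;
otherwise the last paragraph stays in the hold space and is never
printed.

```shell
#!/bin/sh
# Sketch only: sort blank-line-separated paragraphs.
# Assumes GNU sed and input whose last paragraph ends with an empty line.
sort_paragraphs() {
    sed '/./{H;d;};x;s/\n/={NL}=/g' |    # fold each paragraph onto one line
        sort |                           # sort the folded paragraphs
        sed '1s/={NL}=//;s/={NL}=/\n/g'  # unfold; drop the leading marker on line 1
}

# Two paragraphs, deliberately out of order.
printf 'pear\nplum\n\napple\nfig\n\n' | sort_paragraphs
```

After the first sed, each paragraph is a single line that starts with
={NL}=, which is why the second sed strips one marker from line 1 and
turns the rest back into newlines.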

Thus I knew it was a simple solution for me, because I knew up front
that my text files had no such content.
But I agree that for files whose contents you do not know, it is better
to choose a pattern that is much less likely to appear in the file,
such as
==&&##:::@@@!!!

and so on.
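
For files with unknown contents, one option is to test for the marker
before using it. The following is only a sketch: marker_is_safe is a
hypothetical helper name, not something from the thread, and it reads
the text to check from standard input.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): succeeds only when the
# placeholder text in $1 does not occur anywhere on standard input.
marker_is_safe() {
    # -F: match the marker as a fixed string, not a regex
    # -q: no output, just an exit status
    # --: stop option parsing in case the marker starts with a dash
    ! grep -qF -- "$1"
}

if printf 'plain text\nmore text\n' | marker_is_safe '={NL}='; then
    echo "marker is safe to use"
fi
```

In the loop this could run as  marker_is_safe '={NL}=' < "$i"  before
the sed pipeline. One thing to watch: & is special in a sed
replacement (it stands for the matched text), so a marker such as
==&&##:::@@@!!! would need its ampersands written as \& there.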


Cheers,

JD

