Awk and sort (of text files)

Stephen Davies sdavies at sdc.com.au
Mon Jun 29 02:21:22 UTC 2015


On 29/06/15 11:08, jd1008 wrote:
>
>
> On 06/28/2015 07:02 PM, Stephen Davies wrote:
>>
>> On 29/06/15 10:13, jd1008 wrote:
>>>
>>>
>>> On 06/28/2015 06:38 PM, jd1008 wrote:
>>>> Hi,
>>>> I have text files made of paragraphs of text, separated by
>>>> blank lines.
>>>>
>>>> Each "paragraph" is information about a different item.
>>>>
>>>> I need to sort these paragraphs based on the first line
>>>> of each paragraph.
>>>>
>>>> Need some hints how to accomplish this.
>>>>
>>>> Thanx.
>>> Forgot to say that each paragraph is made of multiple lines,
>>> but a paragraph's lines do not contain a blank line.
>> I would just concatenate lines until the blank is reached then write out the
>> concatenated line.
>> The result can then be sorted.
>>
>> If you want to revert the result to paragraphs, just reverse the process
>> outputting lines of up to N characters ending in a space.
>>
>> HTH,
>> Stephen
>>
> Too much work to break the one line back into multiple lines because the
> lines are of different lengths.
> Too many files also. Also, to keep original lines of a paragraph unmangled,
> I would have to first do something like append each line of a paragraph
> with a delineating character to be used by something like sed to change
> that character into a newline.

Adding a "line separator" to each input line is a trivial extension to simply 
concatenating.
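
As a rough illustration of that idea, here is a minimal sketch, not a tested production script. It assumes the ASCII unit separator (octal 037) never occurs in the input, and the function name sort_paragraphs is made up for this example. Setting RS to the empty string puts awk into paragraph mode, so each blank-line-separated paragraph arrives as one record.

```shell
# Sketch: sort blank-line-separated paragraphs by their first line.
# Assumption: the byte \037 (ASCII unit separator) does not occur in the input.
sort_paragraphs() {
    awk -v sep='\037' '
        BEGIN { RS = ""; FS = "\n" }   # one record per paragraph, one field per line
        {
            line = $1
            for (i = 2; i <= NF; i++) line = line sep $i   # join lines with the separator
            print line
        }' "$@" |
    LC_ALL=C sort |                    # byte-wise sort; the first line leads each record
    awk -v sep='\037' '
        NR > 1 { print "" }            # blank line between paragraphs
        { gsub(sep, "\n"); print }     # turn separators back into newlines
    '
}
```

Something like `sort_paragraphs items.txt > items.sorted` would then leave the original lines of each paragraph intact, whatever their lengths.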

The number of input files is irrelevant; just loop through them all as 
awk/gawk inputs and combine the outputs using >> and then sort.
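
A loop of that kind might look like the sketch below. The function name flatten_all and the \037 separator are assumptions made for illustration; the result is left in flattened, one-line-per-paragraph form.

```shell
# Sketch: flatten every paragraph of every input file to one line,
# collect the lines with >>, then sort the combined file in place.
# Usage (hypothetical): flatten_all OUTFILE FILE...
flatten_all() {
    out=$1
    shift
    : > "$out"                         # start with an empty output file
    for f in "$@"; do
        awk -v sep='\037' '
            BEGIN { RS = ""; FS = "\n" }   # paragraph mode
            {
                line = $1
                for (i = 2; i <= NF; i++) line = line sep $i
                print line
            }' "$f" >> "$out"
    done
    LC_ALL=C sort -o "$out" "$out"     # sort may safely write back to its own input
}
```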

If you also want to recreate the original file structure at the end, you could 
add a "file name separator" and the file name to the end of each 
concatenated line.
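
One possible way to tag each line is sketched here. It assumes \037 as the line separator and \036 (ASCII record separator) as the file name separator, neither of which may occur in the text or the file names; tag_paragraphs is a hypothetical name. awk's built-in FILENAME variable supplies the name of the file being read.

```shell
# Sketch: flatten paragraphs and append "<\036><file name>" to each line.
# Assumption: \037 and \036 occur neither in the text nor in the file names.
# Usage (hypothetical): tag_paragraphs FILE...
tag_paragraphs() {
    awk -v sep='\037' -v fsep='\036' '
        BEGIN { RS = ""; FS = "\n" }   # paragraph mode
        {
            line = $1
            for (i = 2; i <= NF; i++) line = line sep $i
            print line fsep FILENAME   # FILENAME is the file currently being read
        }' "$@"
}
```

Because the tag sits at the end of the line, sorting the output still orders the paragraphs by their first line.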

Feed the sorted output back into awk/gawk to rebuild the files, though that, of 
course, destroys the sort sequence.
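
A sketch of the rebuild step follows, under the same assumptions: lines are joined with \037 and each one ends in \036 plus a plain file name with no directory part. The function name rebuild_files and the idea of writing into a separate output directory (rather than over the originals) are choices made for this example only.

```shell
# Sketch: split tagged, sorted lines back into per-file paragraphs.
# Assumption: each input line looks like "line1\037line2...\036name".
# Usage (hypothetical): rebuild_files OUTDIR < tagged-sorted-lines
rebuild_files() {
    awk -v sep='\037' -v fsep='\036' -v outdir="$1" '
        {
            split($0, part, fsep)             # part[1] = paragraph, part[2] = file name
            para = part[1]
            gsub(sep, "\n", para)             # restore the original line breaks
            out = outdir "/" part[2]
            if (seen[out]++) print "" > out   # blank line before every paragraph but the first
            print para > out
        }'
}
```

Each output file stays open until awk exits, so a very large number of distinct files may hit the open-file limit of some awk implementations.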


