Extract file from tar home/

Robert Nichols rnicholsNOSPAM at comcast.net
Wed Jun 9 02:55:17 UTC 2010


On 06/08/2010 07:46 PM, Patrick O'Callaghan wrote:
> On Tue, 2010-06-08 at 16:36 -0700, Suvayu Ali wrote:
>>> To extract only a specific file:
>>> $ tar xf bobg.tar.gz the/file/you/want
>>>
>>
>> I think the OP's worry is not whether it can be done, but he wants to
>> avoid the time and CPU cycles involved in the gunzip step. Since he
>> has his entire home directory in the tarball, even extracting a single
>> file requires tar to decompress the entire tarball before it can
>> extract that one file.
>
> Without looking at the source code one can't be sure, but I'd be
> surprised if that were literally true. IOW I doubt that tar decompresses
> everything to a temp file and then searches for the target. It should
> only be necessary to decompress up to the position of the target file
> (recall that tar originally meant "tape archiver", i.e. it's very
> focussed on doing things in a single sequential pass). I assumed that
> was what the OP wanted to do, and I believe the solution offered
> achieves it. Furthermore, given that the Gzip compression algorithm is
> stream-based, there cannot by definition be any substantially more
> efficient way of doing it other than decompressing the stream up to the
> point of interest.
>
> The alternative archive method (compress each file and then tar up the
> result), would be somewhat easier to extract in this specific case, but
> at the cost of a poorer compression ratio. It's the old space/time
> tradeoff once again.

There is no way other than linear search to find a file in a tar archive,
so tar always has to read** from the beginning of the archive until it
comes to a file you want.  Even after it has extracted everything you
asked for, tar will continue to the end of the archive looking for a
possible later version of one of the files you wanted, appended with an
--append or --update operation (or brought in by --concatenate) after the
original archive was created.  And, while tar can't append to a compressed
archive, it's entirely possible that the archive was compressed by a
separate gzip operation after the appending was done.
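
A minimal sketch of that situation, assuming a tar that supports the r
(append) operation and a separate gzip binary.  The file names
(notes.txt, other.txt, home.tar) are made up for the illustration and
everything happens in a scratch directory:

```shell
work=$(mktemp -d)
cd "$work"

echo "version 1" > notes.txt
echo "unrelated" > other.txt
tar cf home.tar notes.txt other.txt    # original archive

echo "version 2" > notes.txt
tar rf home.tar notes.txt              # append a newer copy at the end

# The archive now holds two members named notes.txt.
copies=$(tar tf home.tar | grep -c '^notes\.txt$')

gzip home.tar                          # compress later, as a separate step

rm notes.txt
tar xzf home.tar.gz notes.txt          # extract by name: the later copy wins
extracted=$(cat notes.txt)

echo "copies in archive: $copies"
echo "extracted content: $extracted"
```

With GNU tar this should report two copies and leave "version 2" on disk,
which is why tar cannot stop at the first match by default.  If you know
there is only one copy, the GNU tar manual documents an --occurrence
option that makes it stop after the first match instead of reading on to
the end.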

** If the archive were an uncompressed ordinary file, it would be
    possible to seek over a large file that you didn't want rather than
    read every block.  I don't think tar is clever enough to do that.
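
To show what seeking over an unwanted member would look like, here is a
rough sketch that walks an uncompressed archive header by header.  It is
not how tar itself is implemented; it only relies on the ustar layout
(member name in the first 100 bytes of each 512-byte header, size as
octal digits at offset 124, data padded to a multiple of 512 bytes) and
on the assumption that dd's skip= seeks on a regular file instead of
reading.  The names demo.tar, big.bin and small.txt are invented:

```shell
work=$(mktemp -d)
cd "$work"

# A large member we do not want, followed by a small one we do.
dd if=/dev/zero of=big.bin bs=1024 count=2048 2>/dev/null
echo "wanted" > small.txt
tar --format=ustar -cf demo.tar big.bin small.txt

block=0        # current position, counted in 512-byte blocks
found_at=""
wanted=""
while :; do
    # Member name: first 100 bytes of the header, NUL padded.
    name=$(dd if=demo.tar bs=1 skip=$((block * 512)) count=100 2>/dev/null | tr -d '\000')
    [ -z "$name" ] && break            # an all-zero block marks the end
    # Member size: 11 octal digits at byte offset 124 of the header.
    size_octal=$(dd if=demo.tar bs=1 skip=$((block * 512 + 124)) count=11 2>/dev/null)
    size=$((0$size_octal))             # leading 0 makes the shell read it as octal
    echo "block $block: $name ($size bytes)"
    if [ "$name" = "small.txt" ]; then
        found_at=$block                # keep going: a later copy would replace this
        wanted=$(dd if=demo.tar bs=1 skip=$(((block + 1) * 512)) count="$size" 2>/dev/null)
    fi
    # Hop over the data blocks without reading them.
    block=$((block + 1 + (size + 511) / 512))
done

echo "small.txt header at block $found_at, content: $wanted"
```

Only the headers and the wanted member's data are read; the 2 MiB of
big.bin are skipped by seeking.  None of this helps with a .tar.gz, where
the stream has to be decompressed up to the point of interest anyway.
For what it's worth, the GNU tar manual does list a -n/--seek option for
archives that support seeking.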

-- 
Bob Nichols     "NOSPAM" is really part of my email address.
                 Do NOT delete it.


