Extract file from tar home/

Wed Jun 9 18:13:17 UTC 2010

On Wed, 2010-06-09 at 00:39 -0430, Patrick O'Callaghan wrote:
> On Tue, 2010-06-08 at 21:55 -0500, Robert Nichols wrote:
> > There is no way other than linear search to find a file in a tar
> > archive, so tar always has to read from the beginning of the archive
> > until it comes to a file you want.
> 
> IIRC for uncompressed tarballs this is not strictly the case. The layout
> is basically metadata-file-metadata-file-... so it's possible to seek
> over intermediate files in strides (as the metadata includes the file
> size). For compressed tarballs of course this won't work, which is what
> I was trying to say.
> 
> > Even after it has extracted everything you
> > asked for, tar will continue to the end of the archive looking for a
> > possible later version of one of the files you wanted, appended with a
> > --concatenate operation after the original archive was created.
> 
> A good point which I hadn't considered.
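
To make the "strides" idea above concrete, here is a rough Python sketch (not how tar itself is implemented) that walks an uncompressed archive header by header and seeks over each member's data. It assumes plain ustar headers with names under 100 bytes and octal size fields; GNU long-name entries, pax extended headers and base-256 sizes are not handled. The names find_member_offset and make_archive are made up for this illustration.

```python
import io
import tarfile

BLOCK = 512  # tar archives are laid out in 512-byte blocks


def find_member_offset(fileobj, wanted):
    """Return (data_offset, size) of the first member named `wanted`, or None.

    Assumes simple ustar headers: name in bytes 0-99, octal size in bytes 124-135.
    """
    while True:
        header = fileobj.read(BLOCK)
        # A short read or an all-zero block marks the end of the archive.
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            return None
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size_field = header[124:136].strip(b"\0 ")
        size = int(size_field, 8) if size_field else 0
        if name == wanted:
            return fileobj.tell(), size
        # Skip this member's data, rounded up to a whole number of blocks.
        fileobj.seek((size + BLOCK - 1) // BLOCK * BLOCK, 1)


def make_archive(members):
    """Hypothetical helper: build an uncompressed ustar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in members:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf
```

The seek only saves work when the underlying file is seekable; on a pipe or a compressed stream every byte still has to be read, which is the distinction made above.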

On further reflexion and a close reading of "info tar", we find the
following:

`--occurrence[=NUMBER]'
     This option can be used in conjunction with one of the subcommands
     `--delete', `--diff', `--extract' or `--list' when a list of files
     is given either on the command line or via `-T' option.

     This option instructs `tar' to process only the NUMBERth
     occurrence of each named file.  NUMBER defaults to 1, so

          tar -x -f archive.tar --occurrence filename

     will extract the first occurrence of the member `filename' from
     `archive.tar' and will terminate without scanning to the end of
     the archive.

So in the case under discussion (i.e. when there's no possibility of an
updated version of the target appearing later in the tarball), the best
approach would be:

tar xf tarball.tar.gz --occurrence target

thus avoiding the unnecessary processing of the rest of the tarball.
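
For comparison, roughly the same early-exit behaviour can be sketched with Python's standard tarfile module. This is only an analogy to --occurrence with the default NUMBER of 1, not a description of what tar does internally, and the function name read_first_occurrence is invented for the example. The compressed stream still has to be decompressed up to the target; the saving is in not reading anything after it.

```python
import io
import tarfile


def read_first_occurrence(fileobj, target):
    """Return the contents of the first regular file named `target`, or None.

    Opens the archive as a forward-only stream ("r|*" auto-detects
    compression) and stops as soon as the first match has been read.
    """
    with tarfile.open(fileobj=fileobj, mode="r|*") as archive:
        for member in archive:
            if member.name == target and member.isfile():
                # Later members, including any newer copy of `target`,
                # are never read.
                return archive.extractfile(member).read()
    return None
```

As with the tar command, this is only safe when a later, updated copy of the target is known not to matter.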

poc