I have a build system that targets a number of different distribution releases, so I get to rebuild a root cache quite often. Fairly frequently, the creation of the root cache tarball fails and causes the package build that triggered the root cache creation to fail. However, simply repeating the build invariably succeeds, and mock uses the supposedly failed cache tarball from the previous build without problems.
I've not looked at this in detail because the workaround has been so easy, but yesterday I decided to take a look at it. I think there are two issues.
Firstly, the cause of the tarball creation failure. Looking at the root log, it appeared to be a change in one of the files whilst it was being archived by tar.
DEBUG util.py, Line: 234: tar: ./usr/lib/locale/locale-archive: file changed as we read it
The same problem with the same file was mentioned in a report dating back two years on fedora-devel-list: http://www.redhat.com/archives/fedora-devel-list/2007-November/msg02599.html
More googling revealed a possible cause of the problem: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg190963.html
So I tried forcing a "sync" before creating the tarball and lo and behold, the problem went away. I've created at least 20 root caches since making this change and all worked fine, which I'm very confident wouldn't have been the case without the "sync". So here's the change I made:
--- /usr/lib/python2.6/site-packages/mock/plugins/root_cache.py.orig 2009-09-02 19:08:54.000000000 +0100
+++ /usr/lib/python2.6/site-packages/mock/plugins/root_cache.py 2009-11-18 15:20:04.353035160 +0000
@@ -110,6 +110,7 @@
         # never rebuild cache unless it was a clean build.
         if self.rootObj.chrootWasCleaned:
             self.state("creating cache")
+            mock.util.do(["sync"], shell=False)
             mock.util.do(
                 ["tar"] + self.compressArgs + ["-cf", self.rootCacheFile,
                 "-C", self.rootObj.makeChrootPath(), "."],
The second problem, I think, is that if the "tar" process that creates the tarball fails (and hence causes the build that triggered it to fail), the cache should be invalidated so that the next build doesn't use that presumably-broken tarball. As it happens, a faulty copy of /usr/lib/locale/locale-archive doesn't seem to cause any problems during my builds, but that may just be my good fortune.
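For illustration, the invalidation could look roughly like the sketch below. It is not mock's actual code: create_root_cache, its parameters and the injectable "run" argument are made up for the example, and it calls subprocess directly rather than mock.util.do. The idea is simply to delete the tarball whenever tar fails, then let the error propagate as before.

```python
import os
import subprocess


def create_root_cache(chroot_path, cache_file, compress_args=("-z",),
                      run=subprocess.check_call):
    """Hypothetical helper: sync, tar up the chroot, drop the tarball on failure."""
    # Flush pending writes first, as in the patch above.
    run(["sync"])
    try:
        run(["tar"] + list(compress_args)
            + ["-cf", cache_file, "-C", chroot_path, "."])
    except (subprocess.CalledProcessError, OSError):
        # Invalidate the cache so the next build cannot pick up a
        # possibly incomplete tarball, then re-raise the original error.
        if os.path.exists(cache_file):
            os.unlink(cache_file)
        raise
```

With something like this in place the triggering build would still fail, but the next build would recreate the cache from scratch instead of unpacking the suspect tarball.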
Cheers, Paul.
buildsys@lists.fedoraproject.org