[fedora-arm] [PATCH 1/3] rpm: always mlock the rpm database

Kedar Sovani kedars at marvell.com
Mon Jan 5 09:42:30 UTC 2009


On Mon, 2009-01-05 at 10:31 +0100, Lennert Buytenhek wrote:
> On Thu, Dec 11, 2008 at 08:59:40AM +0000, Russell King wrote:
> 
> > > > Hacky patch that mlock()s rpmdb's environment mmap(2)s, in order to
> > > > attempt to avoid spurious rpmdb corruption issues on Linux that seem
> > > > to be somehow related to pagein/pageout occurring.
> > > 
> > > Ick.
> > > 
> > > No.
> > 
> > The relevant questions are:
> > 
> > 1. which kernel version is this occurring with?
> > 
> > 2. what device is the swap on?
> > 
> > 3. which drivers are being used?
> 
> This issue goes back to May 2007 or so, when I noticed db4 corruption
> when using rpm.  I started digging into it, and ran into an issue with
> fsx-linux, which you reported to linux-arch@ here:
> 
> 	http://marc.info/?l=linux-arch&m=118026300719763&w=2
> 
> Unfortunately, the issue seen with fsx-linux turned out to be unrelated
> to the rpm db4 corruption issue.
> 
> I applied the hacky rpm db4 database mlock() patch (which was never
> meant to go upstream!) to see if that would make it go away, and it
> seems to have made it go away, since I haven't managed to reproduce
> it since and haven't had any reports about it since.
> 
> Without the mlock patch, the corruption would happen even in
> qemu-system-arm, an environment in which cache aliasing effects don't
> exist, so I abandoned the theory of it being a cache aliasing issue at
> the time and theorised that somehow a dirty page was having its dirty
> data discarded and an older stale copy being swapped back in, although
> I've never been able to prove this -- after spending a week
> unsuccessfully trying to hunt it down at the time I haven't spent any
> more time on it since.  (And everyone I mentioned this to seemed to
> agree that shared writeable mmap() is icky and yuck and booh and "hard
> to get right", and that didn't increase my motivation to look into it
> further either.)
> 
> I don't even know if it's an issue anymore in recent kernels.  I don't
> even know if it's (assuming that it _is_ indeed a kernel issue) an
> arch/arm issue or a kernel-wide issue that simply occurs more often on
> ARM because ARM systems generally have less memory and therefore
> generally have more memory pressure.  (There are certainly enough reports
> of rpm database corruption on x86 as well, but in almost every report
> there are more factors involved, such as people Ctrl-C'ing and killing
> rpm processes as they are manipulating the database, etc.)
> 

I have been running a few systems with a lot of rpm activity without
this patch, and I haven't seen the problem on any of them (possibly
because of the rpm 4.4 to 4.6 transition?). I have dropped that patch
from the F10 rpm patches that I had submitted earlier.
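
For anyone reading the archive later, the approach discussed above (mlock()ing
a shared writable mmap so that its pages are never paged out) can be sketched
roughly as below. This is only a minimal illustration of the technique, not
the actual rpm/db4 patch: the helper names map_and_lock() and unmap_region(),
the file path and the region size are all hypothetical. Note that mlock() can
fail if RLIMIT_MEMLOCK is too low, so the sketch treats a failed lock as
non-fatal.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the rpm patch: map the first `len`
 * bytes of `path` shared and writable, then try to pin the pages in RAM
 * with mlock().  Returns the mapping, or NULL on failure.  *locked is set
 * to 1 if mlock() succeeded and to 0 if it did not.
 */
static void *map_and_lock(const char *path, size_t len, int *locked)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return NULL;

    /* Grow the file if it is shorter than the region; never shrink it. */
    if (fstat(fd, &st) != 0 ||
        (st.st_size < (off_t)len && ftruncate(fd, (off_t)len) != 0)) {
        close(fd);
        return NULL;
    }

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    /* Pin the pages; on failure, report it and carry on unlocked. */
    *locked = (mlock(p, len) == 0);
    if (!*locked)
        fprintf(stderr, "mlock: %s (continuing unlocked)\n", strerror(errno));

    return p;
}

/* Undo map_and_lock(): unlock (result ignored) and unmap the region. */
static int unmap_region(void *p, size_t len)
{
    (void)munlock(p, len);
    return munmap(p, len);
}
```

A caller would use the returned pointer like any other shared mapping and
call unmap_region() when done. Locking only hides a pagein/pageout problem
if there is one; it does not explain or fix the underlying cause.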

> 
> thanks,
> Lennert

Kedar.



