Easter sermon: Of kernels and slocate [was: Slocate hates me (was: Prelink hates me)]

Sat Apr 10 21:51:34 UTC 2004

On Sat, 2004-04-10 at 13:03, Simon Perreault wrote:
> Ok, I found out what the problem was. The culprit is not prelink. It is 
> slocate.

Not really.  The culprit is partly that you do not understand how
filesystems and memory management work in Linux (after all, why should
you?), and partly that they do not behave as a non-hacker would expect. 
Slocate is a symptom, not a problem.  Let me explain in nauseating
detail :-)

The trillion-foot view of filesystems and memory management

When a program like slocate is run, it scans directories and files. 
When it scans directories, the kernel caches inodes (file metadata) in
memory, and it also caches things called dentries, which are just bits
of data that let the kernel figure out more quickly whether a directory
does or doesn't contain a file.

Neither inodes nor dentries are accounted as belonging to any particular
process, because they are system-wide.  However, there are limits on
their sizes, and you can see how much space they're using by examining
the file /proc/slabinfo.  Use the commands "grep inode /proc/slabinfo"
and "grep dentry /proc/slabinfo" before and after you do an slocate run
to see what kind of an effect it has.
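
If you would rather see totals than raw grep output, here is a minimal
sketch of the same before-and-after check.  It assumes the first four
columns of each /proc/slabinfo line are the cache name, active objects,
total objects and object size, which is an assumption about the layout
on your kernel.  The function name slab_usage is made up for this
example.

```python
def slab_usage(slabinfo_text, keywords=("inode", "dentry")):
    """Return {cache_name: approximate_bytes} for matching slab caches.

    Assumes columns are: name, active objects, total objects, object size.
    """
    usage = {}
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Skip header lines and anything whose counters are not plain integers.
        if len(fields) < 4 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue
        name = fields[0]
        if any(keyword in name for keyword in keywords):
            total_objects, object_size = int(fields[2]), int(fields[3])
            usage[name] = total_objects * object_size
    return usage
```

Feed it the contents of /proc/slabinfo (reading that file may need
root) before and after an slocate run, and compare the two results.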

In addition, slocate reads the contents of every file.  The kernel also
caches these contents in the page cache, in case they're needed again. 
Once again, most of the page cache typically isn't accounted as
belonging to any specific process.

All of the dentry, inode, and page cache data that get used during an
slocate run subtract from the amount of memory that the kernel reports
as free, even though the kernel will shrink the sizes of those caches
dynamically if it needs to.

Because the kernel will shrink these caches dynamically, the value
reported for "free" by utilities like top is pretty much completely
worthless.  It doesn't tell you how much memory is actually available,
because loads of memory is held by caches that the kernel can reclaim
on demand.  In normal usage, you even *want* all that memory to be used
by caches.
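
To put a rough number on this, the sketch below pulls the MemFree,
Buffers and Cached values out of the text of /proc/meminfo.  Treat the
sum of the two results as an optimistic guess only: not every cached
page can be dropped, and the dentry and inode caches are not counted
in these fields.  The function name is made up for this example.

```python
def free_and_cache_kb(meminfo_text):
    """Return (free_kb, cache_kb) parsed from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB" style lines.
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    free_kb = values.get("MemFree", 0)
    cache_kb = values.get("Buffers", 0) + values.get("Cached", 0)
    return free_kb, cache_kb
```

On a box that has been up for a while, the second number is usually
much larger than the first, which is the point of the paragraph above.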

So this is why a lack of understanding of the kernel's memory management
is confusing you.  There's nothing wrong with not understanding that;
most people don't need to care.  Now on to the next bit.

Bad behaviour by the kernel

Tools like "slocate" are pathological in their behaviour for caches. 
The cache is meant to help when you need to access the same data again
and again, which is very common.  However, slocate reads every single
file in a filesystem once, so the caches balloon up in size with data
that they don't need to contain.  This causes "memory pressure", which
results in the kernel shrinking caches that might contain useful data,
and paging out parts of programs that you'll probably want in the
morning.  Which is why Linux desktop boxes are usually sluggish first
thing in the morning - they have to read stuff back in off disk.
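
Here is a toy model of that effect.  It is a plain least-recently-used
cache, which is a deliberate simplification and not the kernel's
actual replacement policy; the class and function names are invented
for the illustration.  It warms the cache with a small working set,
runs a one-pass scan over it, and then counts how much of the working
set is still cached.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache that only tracks which keys are present."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Touch key; return True on a hit, False on a miss."""
        if key in self.items:
            self.items.move_to_end(key)      # mark as most recently used
            return True
        self.items[key] = None
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used
        return False


def hits_after_scan(capacity, hot_keys, scan_length):
    """Count how many hot keys are still cached after a one-pass scan."""
    cache = LRUCache(capacity)
    for key in hot_keys:                     # warm the cache
        cache.access(key)
    for i in range(scan_length):             # each scanned key is touched once
        cache.access(("scan", i))
    return sum(cache.access(key) for key in hot_keys)
```

With a capacity of 100 and 50 hot keys, a scan of length 0 leaves all
50 keys cached, while a scan of length 1000 leaves none of them.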

A few of the kernel maintainers don't like this behaviour, because it
surprises users who don't read kernel source for fun, and such surprises
are bad form.  These people want to see the current long-standing
behaviour changed so that you don't pay a performance penalty later for
running tools like slocate, but we're not there yet.

Parting shot

Now that you know that the kernel sizes its caches dynamically, you know
why your memory allocator causes the amount of free memory to jump; it
forces the kernel to shrink those caches.  You also know that you don't
actually want to have much free memory, because truly free memory
reduces the caching that the kernel can do to speed your system up.

Thus endeth the Easter sermon.
