[PATCH] Speed up modprobe and MAKEDEV

Kyle McMartin kyle at mcmartin.ca
Wed Oct 1 17:20:38 UTC 2008


On Wed, Oct 01, 2008 at 06:52:15PM +0200, Jakub Jelinek wrote:
> Given the recent 5sec boot efforts http://lwn.net/Articles/299483/
> and Mandriva follow-ups on that http://lwn.net/Articles/300873/,
> I thought I'd share my modprobe and MAKEDEV speedup patches with
> a wider community so folks can experiment with them, especially seeing
> Mandriva folks playing with turning modprobe into a daemon because of its
> slowness.
> 

Ah, I'm glad to see others working on this! I'd not seen the Mandriva
efforts, but I should really poke at them as I'd turned modprobe into
a library and started integrating it into udev.

> MAKEDEV sources almost 500KB of config files on every invocation, and the
> inner loop is terribly inefficient.  The second patch allows it to shrink
> the files to 55KB while expressing the same info and makes the inner loop
> more efficient.  Depending on how many modprobe and MAKEDEV invocations
> are done on your box during bootup, this may or may not make a measurable
> difference.
> 

Definitely. The biggest win by far for MAKEDEV comes from profiling which
devices are hit most often and prioritizing those. Dave Airlie moved a
bunch of the cciss and other almost never-seen devices to be sourced last
and ended up with a huge win.

> IMHO the binary caches for modprobe are better than
> having thousands of symlinks around (given 7500 wildcards, where many of
> them would need a few dozen symlinks each), especially if the symlinks
> were included in the rpm package, but I can of course be convinced
> otherwise.  In any case, the module-init-tools patch contains a bunch of
> speedups that are IMHO desirable anyway, even without the
> modules.{dep,alias}cache files.
> 

This sounds like a better idea than what I was working on, especially
since I hadn't tackled speeding up the alias issue yet. Big +1 from me
on integrating this. (It also solves a few other issues, like making
things work properly in the cpio initrd...)

Have you tested things on a big endian machine? In any case, this looks
really good. Hopefully Jon will apply this upstream soon so we can get
it in F10.

regards, Kyle

> 	Jakub

> For modules.dep, depmod -a also creates modules.depcache, which is just
> a simple hash table with chains and a dumb, fast compression of strings,
> so that modules.depcache is roughly 3 times smaller than modules.dep.
> The hash function already makes no difference between _ and -, so for dep
> lookups all it needs to do is compute the hash and walk the chain,
> comparing the full 32-bit hash value and, if that hits, also comparing the
> modname string; on success it just returns that entry and its dependencies.
> 
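
The dep lookup described above can be sketched roughly as follows. This is
only an illustration in Python, not the code from the patch: the FNV-1a hash,
the bucket count and the names modname_hash and DepCache are assumptions made
for the example, and the string compression is left out entirely.

```python
def normalize(name):
    # Treat '-' and '_' as the same character, as the hash function does.
    return name.replace("-", "_")


def modname_hash(name):
    # 32-bit FNV-1a over the normalized name (hash choice is an assumption).
    h = 0x811C9DC5
    for byte in normalize(name).encode():
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


class DepCache:
    """Chained hash table mapping a module name to its dependencies."""

    def __init__(self, deps, nbuckets=64):
        self.buckets = [[] for _ in range(nbuckets)]
        for name, dep_list in deps.items():
            h = modname_hash(name)
            entry = (h, normalize(name), list(dep_list))
            self.buckets[h % nbuckets].append(entry)

    def lookup(self, name):
        h = modname_hash(name)
        wanted = normalize(name)
        # Walk the chain: compare the full 32-bit hash first, and only
        # compare the name string when the hash matches.
        for entry_hash, entry_name, dep_list in self.buckets[h % len(self.buckets)]:
            if entry_hash == h and entry_name == wanted:
                return dep_list
        return None
```

With a table built from {"snd_hda_intel": ["snd_pcm", "snd"]}, a lookup of
"snd-hda-intel" finds the same entry, because both spellings hash and compare
identically.
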
> For modules.alias it creates modules.aliascache, which contains a tree.
> Each tree node has 0 or more associated fnmatch wildcards that need to be
> fnmatched at that level unconditionally, then some fixed number of
> characters, and prefixes of up to that length can be binary searched to
> find a further node.  The further node can be one of 3 types: either again
> a normal range node, or just a pointer to a result (if all chars have
> already been compared), or an entry containing the remaining chars to
> strncmp and a pointer to the result.  It is probably better to just read
> the algorithm in modprobe.c for details.  When reading modules.alias,
> modprobe always calls fnmatch on all patterns in there, and fnmatch is
> quite expensive.  With modules.aliascache, for strings which don't have
> any wildcards in them, fnmatch isn't called at all; otherwise it is called
> only on wildcards whose prefix of non-wildcard chars matches the input.
> 
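
The effect on the number of fnmatch calls can be illustrated with a much
simpler structure than the tree in the patch. The sketch below is an
assumption-laden stand-in: it keeps wildcard-free aliases in a dictionary and
scans the wildcard patterns linearly, calling fnmatch only when the literal
prefix of a pattern matches the input. The names AliasIndex and
literal_prefix are made up for the example, and the alias strings are
invented.

```python
import fnmatch

WILDCARD_CHARS = "*?["


def literal_prefix(pattern):
    # Everything before the first wildcard character.
    for i, ch in enumerate(pattern):
        if ch in WILDCARD_CHARS:
            return pattern[:i]
    return pattern


class AliasIndex:
    def __init__(self, aliases):
        self.exact = {}        # aliases without wildcards
        self.wild = []         # (literal prefix, pattern, module)
        self.fnmatch_calls = 0
        for pattern, module in aliases:
            prefix = literal_prefix(pattern)
            if prefix == pattern:
                self.exact.setdefault(pattern, []).append(module)
            else:
                self.wild.append((prefix, pattern, module))

    def lookup(self, name):
        found = list(self.exact.get(name, []))
        for prefix, pattern, module in self.wild:
            if not name.startswith(prefix):
                continue  # cheap rejection, no fnmatch needed
            self.fnmatch_calls += 1
            if fnmatch.fnmatchcase(name, pattern):
                found.append(module)
        return found
```

The real cache replaces the linear scan with a binary search over prefixes at
each tree node, so it also avoids touching most patterns at all; the sketch
only shows why the prefix test removes the bulk of the fnmatch calls.
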
> Both modules.depcache and modules.aliascache contain a file header, which
> embeds a version number, endianness and the mtime/ino/size of the
> corresponding modules.{dep,alias} file, so if that file is hand-edited
> later, modprobe won't use the now stale cache until it is regenerated.
> 
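
A staleness check along these lines might look like the sketch below. The
header layout, the version value and the function names are hypothetical and
are not taken from the patch; only the set of fields (version, endianness,
mtime, inode, size) comes from the description above.

```python
import os
import struct
import sys

CACHE_VERSION = 1  # hypothetical version number
# Hypothetical layout: version, endian flag, mtime, inode, size.
HEADER = struct.Struct("=IBqQq")


def endian_flag():
    return 1 if sys.byteorder == "little" else 2


def make_header(source_path):
    # Record the identity of the text file the cache was generated from.
    st = os.stat(source_path)
    return HEADER.pack(CACHE_VERSION, endian_flag(),
                       int(st.st_mtime), st.st_ino, st.st_size)


def cache_is_fresh(header, source_path):
    if len(header) < HEADER.size:
        return False
    version, endian, mtime, ino, size = HEADER.unpack(header[:HEADER.size])
    # Check the single-byte endian flag first: the multi-byte fields are
    # only meaningful when the byte order matches this machine.
    if endian != endian_flag() or version != CACHE_VERSION:
        return False
    st = os.stat(source_path)
    return (mtime, ino, size) == (int(st.st_mtime), st.st_ino, st.st_size)
```

When the check fails, the caller would simply fall back to parsing the plain
modules.dep or modules.alias file.
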
> In addition to these changes there are some small cleanups here and there,
> e.g. as modprobe and depmod aren't threaded we can speed things up quite a
> lot by using _unlocked functions, and there is no point in every
> getline_wrapped call mallocing a new chunk of memory only for the caller
> to free it (almost) immediately again.
> 
> vanilla, modules.dep (time to handle ~ 1600 dep searches):
> real    0m6.554s
> user    0m6.424s
> sys     0m0.128s
> 
> patched + modules.depcache
> real    0m0.042s
> user    0m0.008s
> sys     0m0.034s
> 
> vanilla, modules.alias (time to handle ~ 6400 alias searches):
> real    1m22.548s
> user    1m21.554s
> sys     0m0.965s
> 
> patched + modules.aliascache
> real    0m0.552s
> user    0m0.438s
> sys     0m0.114s
> 
> time of modprobe pci:v0000EA60d00009897svAsdBbcCscDiE
> real    0m0.034s
> user    0m0.028s
> sys     0m0.006s
> vs.
> real    0m0.007s
> user    0m0.004s
> sys     0m0.003s
> 



