I recently benchmarked the FC4 kernel (2.6.11-1.1369_FC4) against
vanilla 2.6.11.12 using UnixBench. I have to confess that the
substantial performance gap surprised me (the vanilla kernel showed a
hard-to-ignore 62% performance gain over the Fedora default), so I tried
to dig into it and possibly identify the culprit by re-running the tests
with the Fedora kernel recompiled in different configurations.
The tests were run on a 1.7GHz P4 and the configuration tags are:
* orig: the original kernel as shipped with FC4
* nodebug: kernel debugging options disabled
* p4: processor family set to Pentium 4
* nose: NSA SELinux options disabled
* nohm: high memory support turned off
* lean: minimal configuration, matching the test hw
The complete results (there's an OpenSolaris test in there too, feel
free to ignore it ;) ):
http://lufs.sourceforge.net/unixbench.html
The final scores:
1. Linux 2.6.11.12 vanilla (nodebug+p4+nose+nohm+lean): 345.8
2. Linux 2.6.11-1.1369_FC4 (nodebug+p4+nose+nohm+lean): 269.3
3. Linux 2.6.11-1.1369_FC4 (nodebug+p4+nose+nohm): 253.1
4. Linux 2.6.11-1.1369_FC4 (nodebug+p4): 239.4
5. Linux 2.6.11-1.1369_FC4 (nodebug): 236.7
6. Linux 2.6.11-1.1369_FC4 (orig): 213.2
7. SunOS 5.11 (orig): 122.3
(note that #1 and #2 were built with identical configurations)
From this I can infer two overhead components:
- one related to the features enabled in the original kernel
configuration, which accounts for the 6<->2 difference
- one that seems to be introduced by the FC kernel patch set,
responsible for the 2<->1 difference
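To put numbers on the two components, here is a minimal sketch that
derives the relative gains from the final scores listed above. The
dictionary keys and the gain() helper are just illustrative names of
mine, and it assumes the UnixBench index can be compared as a plain
ratio.

```python
# Final UnixBench index scores, copied from the list above.
# The dictionary keys are illustrative labels, not official names.
scores = {
    "vanilla (all tags)": 345.8,  # entry 1
    "FC4 (all tags)": 269.3,      # entry 2
    "FC4 (orig)": 213.2,          # entry 6
}


def gain(slow, fast):
    """Relative improvement of `fast` over `slow`, in percent."""
    return (fast / slow - 1.0) * 100.0


# Configuration component: stock FC4 config vs. fully trimmed FC4 config.
config_gain = gain(scores["FC4 (orig)"], scores["FC4 (all tags)"])
# Patch-set component: trimmed FC4 vs. vanilla built with the same config.
patch_gain = gain(scores["FC4 (all tags)"], scores["vanilla (all tags)"])
# Overall gap: stock FC4 vs. trimmed vanilla.
total_gain = gain(scores["FC4 (orig)"], scores["vanilla (all tags)"])

print(f"configuration (6 -> 2): {config_gain:.1f}%")
print(f"patch set     (2 -> 1): {patch_gain:.1f}%")
print(f"total         (6 -> 1): {total_gain:.1f}%")
```

With these scores the configuration component comes out at roughly 26%
and the patch set component at roughly 28%. The two compound
multiplicatively (1.26 x 1.28), which is how they add up to the 62%
overall figure.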
Regarding the configuration component, I can understand why certain
features, and the overhead associated with them, are preferred over raw
kernel performance. OTOH, leaving this much on the table (about 26% from
configuration alone, 62% overall) makes me feel uneasy. Do I really need
high mem, SELinux or a debug-enabled kernel on my desktop? I don't think
so. But I do want kernel preemption enabled...
My point is: with so many kernel features, "one size fits all" doesn't
hold anymore and maybe we should have a much broader array of kernels to
choose from at install time (not just architecture/SMP variants). This
should be fairly easy to support as it's just a matter of adding new
build configurations in the kernel SRPM/spec.
Regarding the second overhead component, there's still a serious
performance gap between the FC4 kernel and its vanilla counterpart
even when built with identical configurations. This points the finger at
the FC4 kernel patches, which obviously have a big impact on performance.
The system call overhead in particular seems way off. I remember a
discussion about exec shield/NX disabling vsyscall and thus hurting P4s
big time - is this still the case?
Anyway, I wanted to share these results with you and raise awareness of
the kernel performance issue. It would be a shame for FC to get a
slow/bloated OS reputation just because nobody noticed that feature
creep is killing its performance ;).
Cheers,
Florin