On 2014-01-01 21:09, Richard W.M. Jones wrote:
On Wed, Jan 01, 2014 at 12:21:30PM -0800, Sean Omalley wrote:
They are a problem. It is a performance issue at the very least on =ALL= platforms. There is a cost even on Intel's platform for alignment errors, they just fix them up in hardware so it isn't as big of a performance hit. It might be 5 cycles instead of 20.
On Intel Sandybridge and up there is no penalty:
http://www.agner.org/optimize/blog/read.php?i=142&v=t
On earlier Intel processors it's not significant:
http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-r...
Anyway, you are optimizing far too early. If there's a performance problem, run 'perf', find out that it's caused by X where X might be the big misalignment penalty on ARM or many other things, then fix that.
I have just run the test on my Samsung Chromebook (A15) and the results are concerning:
processing word of size 8
offset = 0 ignore this: average time for offset 0 is 77.95
offset = 1 ignore this: average time for offset 1 is 3465.2
offset = 2 ignore this: average time for offset 2 is 3454.25
offset = 3 ignore this: average time for offset 3 is 3451.2
That is 44x slower.
More worryingly, the counters in /proc/cpu/alignment are counting the misaligned accesses (the mode is set to 2, fixup only). I thought that wasn't supposed to happen on ARMv7, since the fixup should be done transparently in hardware without being visible further up the stack.
Note: /proc/cpu/alignment cannot be set to 0 (ignore). Writing 0 still leaves it at 2 (fixup), and writing 1 (warn) leaves it at 3 (fixup+warn).
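For anyone wanting to check the mode on their own board, here is a minimal sketch that reads and decodes it. The names for 0 to 3 are the ones used above; 4 and 5 (signal, signal+warn) come from the kernel's ARM alignment documentation. The function names are made up for illustration, and the "User faults:" line format is an assumption about what the kernel prints.

```c
#include <stdio.h>

/* Map the numeric mode from /proc/cpu/alignment to a name.
 * Bit 0 = warn, bit 1 = fixup, bit 2 = signal (4 and 5 are taken from
 * the kernel documentation, not from the measurements in this mail). */
const char *alignment_mode_name(int mode)
{
    static const char *const names[] = {
        "ignore", "warn", "fixup", "fixup+warn", "signal", "signal+warn"
    };
    if (mode < 0 || mode >= (int)(sizeof names / sizeof names[0]))
        return "unknown";
    return names[mode];
}

/* Read the mode from the "User faults:" line of the given file.
 * Returns -1 if the file cannot be opened or the line is missing.
 * Assumes a line of the form "User faults:<whitespace><number> (...)". */
int read_alignment_mode(const char *path)
{
    FILE *f = fopen(path, "r");
    char line[128];
    int mode = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        /* A space in the format matches any run of whitespace, tabs included. */
        if (sscanf(line, "User faults: %d", &mode) == 1)
            break;
    }
    fclose(f);
    return mode;
}
```

Calling alignment_mode_name(read_alignment_mode("/proc/cpu/alignment")) would report "fixup" on the Chromebook described above, and "unknown" on a machine that has no such file.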
This could be a feature of the 3.4.0 ChromeOS kernel, but if it isn't, that would imply that although the alignment fixup does happen in hardware (and cannot be disabled), there is still a massive performance hit.
Is this a Samsung Exynos 5250 related bug? Or is this the expected behaviour?
I'll try to dig out a Tegra2 machine and see how that compares, but thus far it is not looking good at all.
There's no need to go on a huge crusade to fix every last misalignment, because that will involve vast hours of programmer effort for no measurable gain.
I am currently doing a mass rebuild and will compile a list of all packages I find that are exhibiting unaligned accesses. So far they include some important ones, such as nss.
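One way to flag a package during such a rebuild is to snapshot a counter from /proc/cpu/alignment before and after its build or test run and compare the two. The sketch below is only an illustration of that idea, not the exact method being used here. alignment_counter is a hypothetical helper, and the "Name:<whitespace>count" line layout is an assumption about the kernel's output.

```c
#include <stdio.h>
#include <string.h>

/* Return the count on the line of `text` that starts with `name` followed
 * by ':' (for example "User" or "DWord"), or -1 if there is no such line.
 * `text` is the full contents of /proc/cpu/alignment read into a string. */
long alignment_counter(const char *text, const char *name)
{
    size_t n = strlen(name);
    const char *p = text;

    while (p && *p) {
        /* Only match at the start of a line, so "Word" does not hit "DWord". */
        if (strncmp(p, name, n) == 0 && p[n] == ':') {
            unsigned long v;
            if (sscanf(p + n + 1, "%lu", &v) == 1)
                return (long)v;
            return -1;
        }
        p = strchr(p, '\n');
        if (p)
            p++;            /* step past the newline to the next line */
    }
    return -1;
}
```

The difference in the "User" value between the two snapshots is the number of userspace accesses the kernel had to handle in between. The counters are system-wide, so anything else running at the same time is counted as well.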
Gordan