On Mon, Jan 20, 2014 at 02:45:59PM +0000, Gordan Bobic wrote:
> On 2014-01-01 21:09, Richard W.M. Jones wrote:
> >On Wed, Jan 01, 2014 at 12:21:30PM -0800, Sean Omalley wrote:
> >>They are a problem. It is a performance issue at the very least on
> >>=ALL= platforms. There is a cost even on Intel's platform for
> >>alignment errors, they just fix them up in hardware so it isn't as
> >>big of a performance hit. It might be 5 cycles instead of 20.
> >
> >On Intel Sandybridge and up there is no penalty:
> >
> >http://www.agner.org/optimize/blog/read.php?i=142&v=t
> >
> >On earlier Intel processors it's not significant:
> >
> >http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/
> >
> >Anyway, you are optimizing far too early. If there's a performance
> >problem, run 'perf', find out that it's caused by X where X might be
> >the big misalignment penalty on ARM or many other things, then fix
> >that.
>
> I have just run the test on my Samsung Chromebook (A15) and the
> results are concerning:
>
> processing word of size 8
> offset = 0
> ignore this:
> average time for offset 0 is 77.95
> offset = 1
> ignore this:
> average time for offset 1 is 3465.2
> offset = 2
> ignore this:
> average time for offset 2 is 3454.25
> offset = 3
> ignore this:
> average time for offset 3 is 3451.2
>
> That is 44x slower.

Is this a synthetic benchmark, or is some actual running code from
Fedora 44x slower?
I never said that fixups were free, obviously going in and out of the
kernel to emulate an instruction is going to take some time. The
question is whether it noticeably affects any code.
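
For reference, a synthetic test of this kind boils down to something
like the sketch below. It is only an illustration, not the benchmark
that was actually run: the names load_u64, sum_words and time_offset
are made up for this example, and timing uses clock() for simplicity.
The load goes through memcpy so that the C is well defined at any
offset. A direct cast such as *(const uint64_t *)p on a misaligned
pointer is undefined behaviour, and it is that kind of access which
may end up on the fixup path on some ARM cores.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Read one 64-bit word from an arbitrary byte address.
 * memcpy keeps this well-defined C for any alignment of p. */
static uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Sum `count` 64-bit words starting `offset` bytes into buf.
 * buf must hold at least count * 8 + offset bytes. */
static uint64_t sum_words(const unsigned char *buf, size_t count,
                          size_t offset)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += load_u64(buf + offset + i * sizeof(uint64_t));
    return sum;
}

/* Run `repeats` passes over the buffer at the given offset and
 * return the average CPU time per pass in microseconds.
 * The running total is added to *sink so the compiler is less
 * likely to discard the work. */
static double time_offset(const unsigned char *buf, size_t count,
                          size_t offset, int repeats,
                          volatile uint64_t *sink)
{
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        *sink += sum_words(buf, count, offset);
    clock_t end = clock();
    return (double)(end - start) * 1e6 / CLOCKS_PER_SEC / repeats;
}
```

Calling time_offset() for offsets 0 to 3 over a large buffer and
printing the averages gives numbers comparable in shape to the ones
quoted above. That still only shows what a tight loop of misaligned
loads costs, not whether any real program spends its time there.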
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming blog:
http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)