Features/ArchitectureSupport - changing what we build for

Callum Lerwick seg at haxxed.com
Tue Feb 3 20:29:10 UTC 2009


On Tue, 2009-02-03 at 01:01 -0500, Gregory Maxwell wrote:
> We would see much more substantial gains from things like -msse2 &
> fpmath=sse but, unfortunately, unlike i586 there are a *lot* of
> systems out there (and still being sold) which do not have all the
> fancy instruction set extensions.

I've found -mfpmath=sse to actually be slightly slower than x87. GCC
just isn't very good at generating SSE code yet, whereas people have
been tuning its x87 output for decades now.

And at this point, all the really performance-critical code I've ever
seen already uses runtime selection of hand-tuned SSE/MMX/etc. inner
loops. This is absolutely key. There's little gain to be had from
diddling with GCC's instruction set usage, because the hot paths that
matter already run hand-tuned assembly on CPUs that support it.
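
For anyone who hasn't seen the pattern, here is a minimal sketch of
runtime selection in C. The function names (add_f32, add_f32_scalar,
add_f32_sse2, select_add_f32) are made up for illustration, and the
sketch assumes a GCC or Clang new enough to provide
__builtin_cpu_supports and the target attribute. Real projects often
do the CPUID check by hand and write the fast path in assembly.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <emmintrin.h>          /* SSE/SSE2 intrinsics */
#define HAVE_X86_DISPATCH 1
#endif

typedef void (*add_f32_fn)(float *, const float *, const float *, size_t);

/* Portable fallback: plain C, runs on any CPU the binary targets. */
static void add_f32_scalar(float *dst, const float *a, const float *b,
                           size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#ifdef HAVE_X86_DISPATCH
/* Hand-written fast path. The target attribute lets this one function
 * use SSE2 even when the rest of the file is built for a plain i586. */
__attribute__((target("sse2")))
static void add_f32_sse2(float *dst, const float *a, const float *b,
                         size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {            /* four floats per step */
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
    for (; i < n; i++)                      /* leftover elements */
        dst[i] = a[i] + b[i];
}
#endif

/* Ask the CPU what it supports and return the best implementation. */
static add_f32_fn select_add_f32(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return add_f32_sse2;
#endif
    return add_f32_scalar;
}

/* Public entry point: resolves the pointer on first use, then reuses
 * it. (Not synchronized; a threaded program would resolve at startup.) */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    static add_f32_fn impl;
    assert(dst != NULL && a != NULL && b != NULL);
    if (!impl)
        impl = select_add_f32();
    impl(dst, a, b, n);
}
```

The point is that the binary as a whole can stay at a conservative
-march baseline while the one loop that matters still uses the newer
instructions where the hardware has them.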

Going -O3 rather than -O2 is going to make a bigger difference than
any instruction set flag. If you want to improve performance, you need
to run profiles, locate the performance-critical bits of code, figure
out if -O3 is beneficial, and/or write some hand-tuned
assembly/intrinsic code.
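
A rough way to check whether -O3 helps a particular hot spot is to
time the same loop built both ways. The sketch below is only an
illustration: checksum() is a hypothetical stand-in for whatever a
profiler actually pointed at, and clock() is a coarse timer, so treat
small differences as noise.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical hot loop; substitute the code your profile identified. */
static unsigned long checksum(const unsigned char *buf, size_t n)
{
    unsigned long h = 0;
    for (size_t i = 0; i < n; i++)
        h = h * 31u + buf[i];
    return h;
}

/* Run the loop `reps` times and return the CPU seconds consumed.
 * The buffer is modified on every pass and the result is handed back
 * through *out, so the compiler cannot hoist or discard the work. */
static double time_checksum(unsigned char *buf, size_t n, int reps,
                            unsigned long *out)
{
    unsigned long acc = 0;
    assert(buf != NULL && n > 0 && reps > 0);

    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        buf[0] = (unsigned char)r;
        acc ^= checksum(buf, n);
    }
    clock_t end = clock();

    *out = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Build the file once with gcc -O2 and once with gcc -O3, call
time_checksum() on a realistically sized buffer, and compare. If the
numbers don't move, the flag isn't what that code needs.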

Not to mention, the biggest performance problem on modern processors is
memory. Minimizing cache thrashing is way more important than what
instructions you use. Optimize data structures before code.
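
To make "optimize data structures before code" concrete, here is a
small C sketch comparing two layouts for the same data. The struct
and function names are invented for the example, and the arithmetic
in the comments assumes 4-byte float and int and a 64-byte cache
line, which is common but not universal.

```c
#include <assert.h>
#include <stddef.h>

/* Array-of-structs layout: 32 bytes per element under the stated
 * assumptions, so a 64-byte cache line holds two elements. */
struct particle_aos {
    float x, y, z;
    float vx, vy, vz;
    int   id;
    int   flags;
};

/* Summing only x walks memory with a 32-byte stride: each cache line
 * fetched contributes just two useful floats. */
float sum_x_aos(const struct particle_aos *p, size_t n)
{
    float sum = 0.0f;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += p[i].x;
    return sum;
}

/* Struct-of-arrays layout keeps every x contiguous in its own array,
 * so each cache line fetched contributes sixteen useful floats. */
float sum_x_soa(const float *x, size_t n)
{
    float sum = 0.0f;
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}
```

Both loops compile to nearly the same instructions. The difference is
how much memory traffic each one generates, which no -march setting
will change.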

> Repeating myself here… When it comes down to it, we already have a
> "performance compiled" distribution: x86_64 which has every i686 knob
> turned on and then some. If you care about whatever really minute gain
> you'd get out of arch=i686, then you really should be looking for an
> x86_64 system. (The higher end atoms are quite attractive I hear…)

Yes, most new hardware has been 64-bit capable for years now. There's
little reason to cut off i586 users, when x86-64 already provides a
convenient cutoff point.

My advice as an amateur optimization specialist: don't bother with
-march=i686; there's just not enough gain over i586 to justify the
cost.

And actually I find -march=i586 questionable: is the one additional
instruction something GCC is ever going to use on its own? But i486 is
so ancient I'm not going to cry if we dump it. :)