Igor Gnatenko wrote:
1. Lower the requirement to something like SSE4 and select other CPU
features which are available in most CPUs from the last decade.
Sorry, but -1 to SSE4 too. One of my machines supports only up to SSSE3, and
other replies in this thread have also suggested SSSE3 as the most we can
assume. And if you ask me, we should just stick to SSE2 as the baseline.
What are the big gains to be had from SSE3, SSSE3, SSE4.1, and SSE4.2?
Especially if you limit it to packages that don't do runtime detection?
(Performance-sensitive software SHOULD do runtime detection, and most of it
does, e.g., OpenBLAS.)
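To illustrate what runtime detection looks like, here is a minimal sketch in C. It is not how OpenBLAS or any other particular project does it. It assumes GCC or Clang, uses the compilers' __builtin_cpu_supports() builtin, and the names sum_baseline, sum_sse42 and pick_sum are hypothetical, made up for this example. The package is still compiled for the SSE2 baseline, only the one hot function is built for SSE4.2, and it is only called when the CPU reports support for it.

```c
#include <stddef.h>

/* Baseline version: built with the distribution's default flags
 * (SSE2 on x86_64), so it runs on every supported CPU. */
long sum_baseline(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
/* Same routine, but the compiler is allowed to use SSE4.2 instructions
 * for this one function only. It must not be called on older CPUs. */
__attribute__((target("sse4.2")))
long sum_sse42(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}
#endif

typedef long (*sum_fn)(const int *, size_t);

/* Pick an implementation once, based on what the running CPU reports. */
sum_fn pick_sum(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2"))
        return sum_sse42;
#endif
    return sum_baseline;
}
```

A caller would do something like "sum_fn f = pick_sum();" once at startup and then call f(data, n) in the hot path. Both variants return the same result, so a machine that only supports SSSE3 simply gets the baseline code.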
2. Build every package on x86_64 twice (once for the compatible set and
once for this new-features set), possibly by introducing a sub-architecture
in koji or using koji-shadow (that's just an implementation detail).
Produce an official spin which is built from these packages.
That would at least be tolerable, but still, I'm against it. It sounds like
a huge waste of resources for very little practical gain to me.
3. Invent some mechanism for selecting the appropriate feature set at
runtime (somebody mentioned fat binaries in this thread).
We already have 2 such mechanisms:
* several upstream software packages check CPUID directly. See, e.g., how
OpenBLAS does it. Or the performance-sensitive parts of Chromium. Etc.
* you can drop optimized builds of entire shared objects (.so) into an
appropriate subdirectory of %{_libdir}. Some profiles such as haswell are
already supported. If we need more, they can be added.
So I don't see a need for fat binaries.
Kevin Kofler