On Tue, Apr 4, 2023 at 3:38 AM Zbigniew Jędrzejewski-Szmek zbyszek@in.waw.pl wrote:
On Sun, Apr 02, 2023 at 09:54:04PM +0200, Dan Čermák wrote:
The only benchmark that *I* am aware of is this one done by Martin Jambor: https://jamborm.github.io/spec-2022-07-29-levels/
This is very … underwhelming. x86-64-v2 is essentially identical to x86-64-v1. x86-64-v3 is better. It even shows speed-ups of 20%, but only with -Ofast. And -Ofast is not something that can be enabled as a default build flag, because it leads to surprising and unpredictable behaviour in some cases. (*) At -O2, which we use, the speed up is maybe 10%.
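To make the "surprising behaviour" concrete: in GCC, -Ofast implies -ffast-math, which allows the compiler to reassociate floating-point arithmetic. The snippet below is only an illustrative sketch, not compiler output. It evaluates the same sum in two groupings in Python, with values picked for the example, to show that regrouping alone can change a result.

```python
# Floating-point addition is not associative, so a compiler that is allowed
# to regroup operations (as -ffast-math permits) may change results.
big = 1e20     # illustrative values, not taken from any benchmark
small = 1.0

left_to_right = (big + -big) + small   # grouping as written in the source
regrouped = big + (-big + small)       # one regrouping allowed under reassociation

print(left_to_right, regrouped)
```

In the second grouping, `small` is absorbed when added to `-big`, so the two expressions disagree even though they are algebraically identical.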
tl;dr: v2 does not bring notable improvements; only v3 does, and even then only in some selected synthetic benchmarks.
openSUSE Tumbleweed went a different route and chose to utilize glibc-hwcaps instead:
https://en.opensuse.org/openSUSE:X86-64-Architecture-Levels
https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/thread/Z...
Yeah, I think that's the way to go. I think we should identify 100 shared libraries which would be positively impacted by x86-64-v3 and provide a -v3 subrpm for them. This would be a nice feature for F40.
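For anyone wanting to check which level a given machine would pick up under such a scheme, here is a rough sketch that derives the x86-64 level from the flags line of /proc/cpuinfo. The per-level flag lists and the function names (`x86_64_level`, `read_cpu_flags`) are assumptions of this sketch, not something taken from glibc or from either distribution's tooling, and it only makes sense on x86-64 Linux.

```python
# Flag names as they appear in the "flags" line of /proc/cpuinfo on Linux.
# These per-level sets are an assumption of this sketch.
LEVEL_FLAGS = [
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level (1..4) whose flags, and all lower levels' flags, are present."""
    level = 1
    for candidate, required in LEVEL_FLAGS:
        if not required <= flags:
            break  # levels are cumulative, so stop at the first gap
        level = candidate
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed, or an empty set if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    print("x86-64-v%d" % x86_64_level(read_cpu_flags()))
```

On glibc 2.33 or later, running the dynamic loader with --help (its path varies by distribution) should also list the glibc-hwcaps subdirectories it would search on the current machine, which is the more authoritative check.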
Arch's data further confirms that even x86_64-v3 isn't necessarily a win: https://sunnyflunk.github.io/2023/01/15/x86-64-v3-Mixed-Bag-of-Performance.h...
It seems that moving to -O3 would provide more gains than x86_64-v3.
Otherwise, the only thing you really get from raising the baseline subarch is breaking lots of hardware.
-- 真実はいつも一つ!/ Always, there's only one truth!