tl;dr: Python 3.12 should be built with no-omit-frame-pointer if upstream recommends it.
Absolutely not, because…
Apparently there are some benchmarks that make Python look extra slow when the flags are turned on
… considering those benchmarks, Python is one of the programs for which it would be the *least* appropriate to enable frame pointers!
which I don't quite understand.
So then please try to figure it out. As long as the performance hit is as big as it is, enabling frame pointers is *not* acceptable.
Meanwhile, on the upstream side, Python 3.12 (due next year, main Python for Fedora 39 [3]) has support for `perf`. Upstream plans to recommend compiling with these flags when measuring performance [4], and AFAIK, the plan is to recommend *always* compiling with them.
[3]: https://fedoraproject.org/wiki/Changes/Python3.12
[4]: https://docs.python.org/3.12/howto/perf_profiling.html#how-to-obtain-the-bes...
Looking at the details in the link, would it not be sufficient to have the perf trampolines compiled with frame pointers enabled (which will only affect runs with perf support enabled at runtime)? Or actually, since they are "compiled on the fly", do those trampolines not always have a frame pointer anyway, no matter how Python is compiled?
The idea is that possible speedups from "allowing anyone to profile/optimize their workflow" are worth the initial slowdown.
I and others strongly dispute this claim. There is no evidence that anybody will be doing enough profiling and optimization to compensate for the up to 10% performance hit seen on performance-critical Python code in the benchmarks, nor that an optimization that large is even possible to begin with. Also, profiling and optimizing individual Python programs (as opposed to the Python interpreter itself) will not help the vast majority of Python code out there at all.
As far as I can see, performance geeks are enthusiastic for `perf` support,
As you say yourself, this is a niche feature for "performance geeks".
and I'd like to get them to (continue to) use Fedora builds.
That is in theory a laudable goal, but not if it degrades the performance for everyone else.
If CPython upstream does recommend these flags (or makes them default), I'm considering turning the no-omit options on for Python 3.12 even if Fedora as a whole doesn't.
Then I can only hope that FESCo will explicitly disallow that.
Note that even a 2% slowdown will likely be won back by general performance improvements – the Faster CPython team is targeting a 20% average speedup for pure-Python code in 3.12, on top of the ~25% for 3.11. And the people responsible for this speedup have a say in the upstream recommendations.
Those 20% are a stated goal, not a guarantee. Even if attained, they will also not necessarily help all programs the same amount. And most importantly, Fedora will benefit from those upstream optimizations either way, even if it does not ship a build optimized for profiling rather than performance. So I do not see this as an argument for shipping Python with frame pointers.
I am also very sceptical of that "grabbing for the stars" approach of accepting a 2-10% performance loss in the hopes of getting a 20% performance win that may or may not be actually achievable. In German, we have a proverb that literally translates to: "Better the sparrow in the hand than the dove on the roof." It means that it is better to take the small thing that you already have than to throw it away for a bigger thing that you have yet to catch.
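To put rough numbers on the point that the upstream speedup arrives either way, here is a minimal sketch. It assumes that a "20% speedup" means 1.2x throughput and that the frame pointer cost multiplies runtime; both are my assumptions for illustration, and `relative_runtime` is a hypothetical helper, not something from the benchmarks.

```python
def relative_runtime(speedup_pct, overhead_pct):
    """Runtime relative to the old baseline (1.0 = unchanged).

    Assumption: a speedup of S% divides runtime by (1 + S/100),
    and an overhead of O% multiplies runtime by (1 + O/100).
    """
    return (1.0 + overhead_pct / 100.0) / (1.0 + speedup_pct / 100.0)


if __name__ == "__main__":
    # Overhead values taken from the 2-10% range discussed in this thread.
    for overhead in (0, 2, 10):
        runtime = relative_runtime(20, overhead)
        print(f"20% speedup, {overhead}% overhead: {runtime:.3f}x baseline runtime")
```

Under these assumptions, the build with frame pointers stays slower than the build without them by the same factor, whatever speedup upstream achieves.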
Technically there are several separate places where the flags can be set, and I think we should turn them on everywhere:
- CPython itself & its standard library
- The debug build (/usr/bin/python3-debug)
- Default for libraries in Fedora (RPM macros)
- Default for libraries built by users (sysconfig settings)
IMHO, the debug build is the only one where it might possibly make sense to do so. Anything else would hurt the performance for real-world users who are not interested in profiling at all.
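For anyone who wants to check which of these settings a given interpreter carries, the following is a rough sketch that reads the build flags recorded in `sysconfig`. The flag names are the ones I understand the upstream perf howto to recommend, so treat them as an assumption, and `frame_pointer_flags` and `report` are hypothetical helper names. The variables may be unset on some platforms, in which case nothing is reported as present.

```python
import sysconfig

# Assumed flag names; adjust if the upstream recommendation differs.
FLAGS = ("-fno-omit-frame-pointer", "-mno-omit-leaf-frame-pointer")


def frame_pointer_flags(cflags):
    """Map each flag of interest to whether it appears in a CFLAGS string."""
    tokens = (cflags or "").split()
    return {flag: flag in tokens for flag in FLAGS}


def report():
    """Print what the running interpreter recorded at build time."""
    for var in ("CFLAGS", "PY_CFLAGS"):
        value = sysconfig.get_config_var(var)  # None if the variable is unknown
        print(var, frame_pointer_flags(value))


if __name__ == "__main__":
    report()
```

Running it under both /usr/bin/python3 and /usr/bin/python3-debug would show whether the two builds differ. Note that these recorded values are also what extension modules built by users inherit by default.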
Kevin Kofler