On 15. 11. 22 20:17, Kevin Kofler wrote:
tl;dr: Python 3.12 should be built with no-omit-frame-pointer if upstream recommends it.
Absolutely not, because…
Apparently there are some benchmarks that make Python look extra slow when the flags are turned on
… considering those benchmarks, Python is one of the programs for which it would be the *least* appropriate to enable frame pointers!
Oh, do you have more info on these benchmarks? The mentions I found were pretty vague. Is it DaanDeMeyer/fpbench? It seems pyperformance is the majority of those benchmarks, so there's not much to compare to. It's also the benchmark suite that upstream uses, so I'm certain any upstream recommendation will take it into account.
Meanwhile, on the upstream side, Python 3.12 (due next year, main Python for Fedora 39 [3]) has support for `perf`. Upstream plans to recommend compiling with these flags when measuring performance [4], and AFAIK, the plan is to recommend *always* compiling with them.
[3]: https://fedoraproject.org/wiki/Changes/Python3.12
[4]: https://docs.python.org/3.12/howto/perf_profiling.html#how-to-obtain-the-bes...
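For anyone who wants to check a given interpreter build, here is a minimal sketch. It assumes the two flags in question are `-fno-omit-frame-pointer` and `-mno-omit-leaf-frame-pointer`, and that the build-time `CFLAGS` recorded by `sysconfig` reflect how the interpreter was actually compiled, which may not hold for every distribution build. The helper name `missing_frame_pointer_flags` is made up for this illustration.

```python
import sysconfig

# Assumption: these are the flags under discussion in this thread.
RECOMMENDED_FLAGS = ("-fno-omit-frame-pointer", "-mno-omit-leaf-frame-pointer")


def missing_frame_pointer_flags(cflags: str) -> list[str]:
    """Return the recommended flags that do not appear in a CFLAGS string."""
    present = set(cflags.split())
    return [flag for flag in RECOMMENDED_FLAGS if flag not in present]


if __name__ == "__main__":
    # CFLAGS as recorded at build time; can be None on some platforms.
    cflags = sysconfig.get_config_var("CFLAGS") or ""
    missing = missing_frame_pointer_flags(cflags)
    print("missing flags:", ", ".join(missing) if missing else "none")
```

An empty result only says the flags were recorded at build time; it does not say anything about the C libraries or extension modules the interpreter loads.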
Looking at the details in the link, would it not be sufficient to have the perf trampolines compiled with frame pointers enabled (which will only affect runs with perf support enabled at runtime)? Or actually, since they are "compiled on the fly", do those trampolines not always have a frame pointer anyway, no matter how Python is compiled?
That sounds like a question for upstream, do you want to bring it up there? (I won't, as I don't think I could follow up well on the discussion.)
The idea is that possible speedups from "allowing anyone to profile/optimize their workflow" are worth the initial slowdown.
I and others heavily dispute this claim. There is no evidence that anybody will be doing enough profiling and optimization to compensate for the up to 10% performance hit seen on performance-critical Python code in the benchmarks, nor that such big optimization is even possible to begin with. Also, profiling and optimizing individual Python programs (as opposed to the Python interpreter itself) will not help the vast majority of Python code out there at all.
I know enough about the pyperformance benchmarks to suspect that talking about the highest slowdown is just a scare tactic. Not all benchmarks are indicative of real-world use. The one that got closest to 10% (scimark_sparse_mat_mult, with a 9.5% hit [0]) is pure-Python matrix multiplication -- perhaps the best example of code you would never put in production today.
As far as I can see, performance geeks are enthusiastic for `perf` support,
As you say yourself, this is a niche feature for "performance geeks".
and I'd like to get them to (continue to) use Fedora builds.
That is in theory a laudable goal, but not if it degrades the performance for everyone else.
If CPython upstream does recommend these flags (or makes them default), I'm considering turning the no-omit options on for Python 3.12 even if Fedora as a whole doesn't.
Then I can only hope that FESCo will explicitly disallow that.
Note that even a 2% slowdown will likely be won back by general performance improvements – the Faster CPython team is targeting a 20% average speedup for pure-Python code in 3.12, on top of the ~25% for 3.11. And the people responsible for this speedup have a say in the upstream recommendations.
Those 20% are a stated goal, not a guarantee. Even if attained, they will also not necessarily help all programs the same amount. And most importantly, Fedora will benefit from those upstream optimizations either way, even if it does not ship a build optimized for profiling rather than performance. So I do not see this as an argument for shipping Python with frame pointers.
I am also very sceptical of that "grabbing for the stars" approach of accepting a 2-10% performance loss in the hopes of getting a 20% performance win that may or may not be actually achievable. In German, we have a proverb that literally translates to: "Better the sparrow in the hand than the dove on the roof." It means that it is better to take the small thing that you already have than to throw it away for a bigger thing that you have yet to catch.
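To put the figures quoted in this exchange side by side, here is a rough sketch of the arithmetic. It assumes a "20% speedup" means throughput goes up by 20% (runtime divided by 1.2) and that the frame-pointer cost is a plain runtime increase. Both readings are assumptions, and the 20% itself is a stated target rather than a measurement. The function name `relative_runtime` is made up for this illustration.

```python
def relative_runtime(speedup: float, slowdown: float) -> float:
    """Runtime relative to today's build (1.0 means unchanged).

    speedup:  fractional throughput gain, e.g. 0.20 for the 3.12 target
    slowdown: fractional runtime increase attributed to frame pointers
    """
    return (1.0 + slowdown) / (1.0 + speedup)


# Compare no frame pointers, the 2% figure, and the 10% worst case,
# all under the hypothetical 20% speedup.
for slowdown in (0.0, 0.02, 0.10):
    ratio = relative_runtime(0.20, slowdown)
    print(f"frame-pointer cost {slowdown:.0%}: runtime x{ratio:.3f} of today's")
```

Under these assumptions, every case comes out faster than today's build, and the build with frame pointers is slower than the one without by the same relative amount whether or not the speedup happens.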
We don't “have” anything now, the Python 3.12 release is a year away. I'm pretty sure the upstream recommendation will take the then-current state into account.
In fact, in your mail I don't really see any Fedora-specific reason to override an upstream recommendation, when it comes.
[0]: https://github.com/DaanDeMeyer/fpbench/blob/7233a6cfcd01467b429e1a22d961e3f7...