On 09. 11. 22 12:37, Petr Viktorin wrote:
tl;dr: Python 3.12 should be built with no-omit-frame-pointer if
upstream recommends it.
Hello,
You might be aware of a Fedora change proposal [0] (discussed on
fedora-devel [1] and FESCo [2]) to turn on C compiler flags that
help with performance *measurement*, but might hurt performance
itself: `-fno-omit-frame-pointer` and
`-mno-omit-leaf-frame-pointer`.
Apparently there are some benchmarks that make Python look extra slow
when the flags are turned on -- which I don't quite understand.
Update: Andrii Nakryiko looked at the disassembly:
https://pagure.io/fesco/issue/2817#comment-826636
Looks like the slowdown comes from a single function,
_PyEval_EvalFrameDefault, which essentially *is* the Python interpreter
-- so it's very unlikely to be indicative of other code in the distro
(including Python extension modules).
That function is being overhauled for Python 3.12 (in the main branch
it's now autogenerated [5], which should hopefully allow optimizations
across any common code, and perhaps things like switching to a
register-based VM [6]).
Given these massive planned changes in the affected function, I think we
should treat Python 3.11 and 3.12 as entirely separate when it comes to
performance with no-omit-frame-pointer.
And since the Python slowdown comes from a single weird function, I
think that Fedora should ignore the Python benchmarks when evaluating
the distro default -- and if Fedora switches to no-omit-frame-pointer,
Python 3.11 should be an exception (to be re-evaluated for 3.12).
(Most of the current benchmarks [7] are from Python, so more might be
needed.)
[5]:
https://github.com/python/cpython/issues/98831
[6]:
https://github.com/faster-cpython/ideas/issues/489
[7]:
https://github.com/DaanDeMeyer/fpbench
Meanwhile, on the upstream side, Python 3.12 (due next year, main Python
for Fedora 39 [3]) has support for `perf`. Upstream plans to recommend
compiling with these flags when measuring performance [4], and AFAIK,
the plan is to recommend *always* compiling with them.
The idea is that possible speedups from "allowing anyone to
profile/optimize their workflow" are worth the initial slowdown.
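As a rough illustration of the `perf` support mentioned above, here is a minimal sketch that tries to switch on the perf trampoline at runtime. It assumes Python 3.12+ on Linux; the helper name `try_enable_perf_trampoline` is made up for this example, and on older or unsupported builds it simply reports False.

```python
import sys


def try_enable_perf_trampoline():
    """Return True if the perf trampoline is active after this call.

    Hypothetical helper for illustration; it only wraps the sys functions
    added in Python 3.12.
    """
    activate = getattr(sys, "activate_stack_trampoline", None)
    if activate is None:
        # Interpreter older than 3.12: no perf trampoline support at all.
        return False
    try:
        activate("perf")
    except Exception:
        # Broad on purpose: unsupported platforms/builds raise here, and
        # this sketch treats every failure as "not available".
        return False
    return bool(sys.is_stack_trampoline_active())


if __name__ == "__main__":
    print("perf trampoline active:", try_enable_perf_trampoline())
```

The same thing can be requested without code changes via `python -X perf` or the `PYTHONPERFSUPPORT=1` environment variable. Either way, the resulting `perf` call stacks are only reliable if the interpreter was built with frame pointers.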
I'm not much of a performance expert myself, but I do get drawn into the
relevant discussions on the CPython side.
As far as I can see, performance geeks are enthusiastic for `perf`
support, and I'd like to get them to (continue to) use Fedora builds.
If CPython upstream does recommend these flags (or makes them default),
I'm considering turning the no-omit options on for Python 3.12 even if
Fedora as a whole doesn't.
Note that even a 2% slowdown will likely be won back by general performance
improvements – the Faster CPython team is targeting a 20% average
speedup for pure-Python code in 3.12, on top of the ~25% for 3.11. And
the people responsible for this speedup have a say in the upstream
recommendations.
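A back-of-the-envelope sketch of that trade-off, under two assumptions of mine: the flags multiply runtime by (1 + slowdown), and the targeted speedup multiplies throughput by (1 + speedup). The function name is made up for this example, and the 2% and 20% figures are the ones quoted above, not measurements.

```python
def relative_runtime(flag_slowdown, interpreter_speedup):
    """Runtime relative to the old baseline (1.0 means unchanged).

    Assumes the slowdown scales runtime and the speedup scales throughput.
    """
    return (1.0 + flag_slowdown) / (1.0 + interpreter_speedup)


# 2% slowdown from the flags, 20% targeted speedup for 3.12
ratio = relative_runtime(0.02, 0.20)
print(f"{ratio:.2f}")  # prints 0.85
```

Under these assumptions, pure-Python code would still take about 85% of its previous runtime with the flags enabled.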
Technically there are four separate places where the flags can be set;
I think we should turn them on everywhere:
- CPython itself & its standard library
- The debug build (/usr/bin/python3-debug)
- Default for libraries in Fedora (RPM macros)
- Default for libraries built by users (sysconfig settings)
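
For the sysconfig part, a minimal sketch of how one could check what a given interpreter was built with, and so what extension builds would inherit by default. The helper names are invented for this example, and I'm assuming a POSIX build where `CFLAGS` and `PY_CFLAGS` are recorded in sysconfig (on other platforms they may be missing, which the code treats as empty).

```python
import sysconfig

FRAME_POINTER_FLAGS = (
    "-fno-omit-frame-pointer",
    "-mno-omit-leaf-frame-pointer",
)


def frame_pointer_flags(cflags):
    """Map each frame-pointer flag to whether it appears in a CFLAGS string."""
    tokens = cflags.split()
    return {flag: flag in tokens for flag in FRAME_POINTER_FLAGS}


def interpreter_frame_pointer_flags():
    """Check the flags recorded by the running interpreter's build."""
    # get_config_var returns None for variables the build did not record.
    recorded = " ".join(
        str(sysconfig.get_config_var(name) or "")
        for name in ("CFLAGS", "PY_CFLAGS")
    )
    return frame_pointer_flags(recorded)


if __name__ == "__main__":
    for flag, present in interpreter_frame_pointer_flags().items():
        print(f"{flag}: {'yes' if present else 'no'}")
```

This only inspects what sysconfig reports; it says nothing about flags a user overrides at build time.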
[0]:
https://fedoraproject.org/wiki/Changes/fno-omit-frame-pointer
[1]:
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.o...
[2]:
https://pagure.io/fesco/issue/2817
[3]:
https://fedoraproject.org/wiki/Changes/Python3.12
[4]:
https://docs.python.org/3.12/howto/perf_profiling.html#how-to-obtain-the-...