On 5/9/23 21:27, DJ Delorie wrote:
Jarek Prokop <jprokop(a)redhat.com> writes:
> Are the libffi/rebuilt packages available anywhere for us to
> experiment with?
MPB uses COPR, so..
"before" builds: https://copr.fedorainfracloud.org/coprs/djdelorie/libffi-3.4.4.che...
"after" builds: https://copr.fedorainfracloud.org/coprs/djdelorie/libffi-3.4.4/
Great! Thanks for the links, I've done manual testing with the reproducer:
~~~
require 'fiddle/closure'
require 'fiddle/struct'
Fiddle::Closure.new(Fiddle::TYPE_VOID, [])
fork { }
GC.start
~~~
And with the copr build of libffi 3.4.4-3, Ruby indeed no longer
crashes on this code.
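
Since the crash was intermittent, one clean run is weak evidence on its
own. Below is a minimal sketch of repeating the reproducer in fresh
processes and counting the runs that die from a signal. The helper name
count_signal_deaths and the REPRODUCER constant are made up for
illustration, not part of Ruby or Fiddle. It also assumes the crash
takes down the top-level process: a crash confined to the forked child
would go unnoticed, because only the parent's exit status is inspected.

```ruby
require 'rbconfig'

# Hypothetical helper: run `script` in a fresh Ruby interpreter `runs`
# times and return how many of those runs were terminated by a signal
# (e.g. SIGSEGV or SIGABRT) rather than exiting normally.
def count_signal_deaths(script, runs: 10)
  runs.times.count do
    pid = Process.spawn(RbConfig.ruby, '-e', script,
                        out: File::NULL, err: File::NULL)
    _, status = Process.wait2(pid)
    status.signaled?
  end
end

# The reproducer from this thread, unchanged.
REPRODUCER = <<~RUBY
  require 'fiddle/closure'
  require 'fiddle/struct'
  Fiddle::Closure.new(Fiddle::TYPE_VOID, [])
  fork { }
  GC.start
RUBY

# Usage (not executed here):
#   crashes = count_signal_deaths(REPRODUCER, runs: 10)
#   puts "#{crashes} of 10 runs died from a signal"
```

A non-zero exit without a signal (for example a LoadError when fiddle
is not installed) is deliberately not counted as a crash.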
> We have a reasonably reliable reproducer in Ruby [0] (also included in
> commit message [1]), but it is not executed as part of test suite,
Yes, the fork-without-exec case is a known "that should never have worked"
case that only happens to work when your closure's backing store is also
forked, which file-based mappings are *not*. You need either really old
(rwx mmap, which security disables) or really new (static trampolines,
which are r-x/rw- mmap'ed) libffi to support that. Hopefully that means
your reproducers should not reproduce with the new libffi.
> Moreover, a rebuild with the current Ruby specfiles won't tell you much,
> as we commented out the tests [2] to have less flaky builds. I'd
> recommend uncommenting the lines and running 5 to 10 builds (or just
> running either of the 2 reproducers).
Well, if you comment out the tests, I have no way of knowing I broke
anything, so I have to rely on posting Change Requests and letting you
let me know ;-)
And it is great to see :)
Not saying you did anything wrong; if you have tests that pass or fail
depending on system configurations outside your control, it's difficult
to reliably test what you want to test. I'm just saying that when you
disable tests, automated processes have no insight into those failures.
We are aware of this downside :/, we worked with the relevant upstream,
but a proper fix on the side of rubygem-fiddle would require a
nontrivial rewrite. (rubygem-ffi conversely has a closure pool that,
AFAICT, prevents the issue altogether.)
This was a long-running issue (read: spanning a few Fedora releases) and
doing a rebuild 10 times to have 1 not segfault and go through the rest
of the pipeline just for a teeny version rebase got really tiring.
Thanks,
Jarek Prokop