Hello packagers,
I intend to rebuild the following packages with libffi 3.4 in the Rawhide side tag f36-build-side-49314 today.
The previous version remains available as libffi3.1, so build failures will not result in uninstallable packages.
You can inspect some known failures:
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/cjs/ https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/gjs/ both "killed by signal 11 SIGSEGV"
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/hadoli... Not all RPM dependencies satisfied
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/jffi/ make: *** No rule to make target '-L/usr/lib64/../lib64', needed by '/builddir/build/BUILD/jffi-jffi-1.3.4/build/jni/jffi/Array.o'. Stop.
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/ruby/ Segmentation fault FAIL 1/1489 tests failed
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/thunde... collect2: error: ld returned 1 exit status
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/xs/ Run-time dependency ffi found: NO (tried pkgconfig and cmake)
TIP: If your package is present in c9s, consider looking at how it was fixed there.
The rest of the packages I'll rebuild (passed on x86_64 in copr):
Agda alex bench brainfuck bustle cab cabal-install cabal-rpm cpphs darcs dfuzzer dhall dhall-json dl-fedora ecl fbrnch firefox gambas3 gforth ghc ghc-aeson-pretty ghc-cabal-helper ghc-clientsession ghc-criterion ghc-DAV ghc-doctest ghc-hakyll ghc-HaXml ghc-hgettext ghc-highlighting-kate ghc-hjsmin ghc-hspec-discover ghc-cheapskate ghcid ghc-libffi ghc-pretty-show ghc-servant-server ghc-vty ghc-wai-app-static ghc-wai-websockets ghc8.10 ghc9.0 ghc9.2 git-annex gitit git-repair glib2 gnustep-base gobject-introspection gtk2hs-buildtools guile guile22 guile30 happy haskell-platform hedgewars hledger hledger-ui hledger-web hlint hscolour idris jna libomp librep llvm llvm10 llvm11 llvm12 llvm7.0 llvm9.0 lsfrom lua-lgi micropython moarvm ocaml-ctypes ormolu pagure-cli pandoc patat perl-FFI-Platypus perl-Glib-Object-Introspection php pkgtreediff pygobject2 pygobject3 pypy pypy3.7 pypy3.8 python-cffi python2.7 python3.10 python3.11 python3.6 python3.7 python3.8 python3.9 p11-kit racket rhbzquery rpmbuild-order rubygem-ffi seamonkey shake ShellCheck squeak-vm tart unlambda wayland xmobar xmonad yosys
As always, please don't rebuild the package in regular rawhide until the side tag is merged.
On 08. 01. 22 10:37, Miro Hrončok wrote:
> Hello packagers,
> I intend to rebuild the following packages with libffi 3.4 in the Rawhide side tag f36-build-side-49314 today.
> The previous version remains available as libffi3.1, so build failures will not result in uninstallable packages.
> You can inspect some known failures: ...
> The rest of the packages I'll rebuild (passed on x86_64 in copr):
> ...
> As always, please don't rebuild the package in regular rawhide until the side tag is merged.
For the record, I intend to build the following packages only after the side tag is merged, because they build for a looooong time:
firefox racket llvm llvm10 llvm11 llvm12 llvm7.0 llvm9.0 pypy pypy3.7 pypy3.8
Thanks to the compat package, it should not break anything.
On 1/8/22 04:37, Miro Hrončok wrote:
> Hello packagers,
> I intend to rebuild the following packages with libffi 3.4 in the Rawhide side tag f36-build-side-49314 today.
Thank you for helping with the rebuilds!
> The previous version remains available as libffi3.1, so build failures will not result in uninstallable packages.
> You can inspect some known failures:
> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/cjs/
> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/gjs/
> both "killed by signal 11 SIGSEGV"
Right, this is because there is an ordering dependency between gobject-introspection and cjs/gjs. You need the introspection library rebuilt first with libffi 3.4+ and then build the JavaScript packages; that way they both use the same library and can pass objects back and forth for introspection.
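The build order described above could be scripted roughly as follows. This is only a sketch: it assumes one dist-git checkout per package in the current directory, and the helper name `plan_side_tag_builds` is made up for illustration. It prints the commands instead of running them, so the order can be reviewed first.

```shell
#!/bin/bash
# Sketch only: print the build commands for a side tag in dependency order.
# plan_side_tag_builds is a hypothetical helper, not part of fedpkg or koji.
# Assumes each package has a dist-git checkout in a directory of the same name.

plan_side_tag_builds() {
    local side_tag=$1
    shift
    local pkg
    for pkg in "$@"; do
        # Build the package against the side tag.
        printf '(cd %s && fedpkg build --target=%s)\n' "$pkg" "$side_tag"
        # Wait until the new build is visible in the side tag's repo
        # before starting anything that depends on it.
        printf '(cd %s && koji wait-repo %s --build="$(fedpkg verrel)")\n' "$pkg" "$side_tag"
    done
}

# gobject-introspection first, then the JavaScript bindings that use it.
plan_side_tag_builds f36-build-side-49314 gobject-introspection gjs cjs
```

The wait between gjs and cjs is not strictly needed, since neither depends on the other; it is kept only to keep the loop simple.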
> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/hadoli... Not all RPM dependencies satisfied
Agreed.
DEBUG util.py:444: No matching package to install: 'ghc-colourista-prof'
DEBUG util.py:444: No matching package to install: 'ghc-ilist-prof'
DEBUG util.py:444: No matching package to install: 'ghc-spdx-prof'
DEBUG util.py:444: No matching package to install: 'ghc-timerep-prof'
DEBUG util.py:444: Not all dependencies satisfied
DEBUG util.py:444: Error: Some packages could not be found.
Rawhide did build successfully on 2021-11-30, but that was a while ago and the deps have issues now.
> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/jffi/ make: *** No rule to make target '-L/usr/lib64/../lib64', needed by '/builddir/build/BUILD/jffi-jffi-1.3.4/build/jni/jffi/Array.o'. Stop.
I don't know why this one fails. Passed in c9s with earlier jffi and libffi 3.4.
Rawhide did build successfully on 2021-08-22, but that was a while ago.
> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/ruby/ Segmentation fault FAIL 1/1489 tests failed
This failed in the same place in two different builds in the test_ractor.rb (Ruby Ractor) test case, and it crashes in 'ractor_select()' within 'rb_vm_exec()'.
It is odd that it should fail with the libffi update, since the failure is in the Ruby Ractor test, which I wouldn't expect to use any of the FFI APIs. It has failed twice in the same place, though.
I didn't see this in c9s. The last built ruby in c9s was built by me and it is 3.0.2-155, where test_ractor.rb passes just fine built with libffi 3.4.
The ruby-mri binary has no deep DT_NEEDED dependencies that would need libffi or other libraries to be built in a particular order, but with dlopen you can get odd ordering issues that are only resolved after the SONAME bump is complete and the rebuilds have completed across dependent libraries.
Rawhide did build successfully on 2021-12-10.
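For anyone who wants to check the DT_NEEDED side of this themselves, here is a minimal sketch. The helper name `needed_libs` is made up for illustration; it only filters the output of `readelf -d`. It shows direct link-time dependencies only, so anything loaded later through dlopen will not appear, which is exactly the case described above.

```shell
#!/bin/bash
# Sketch only: list the DT_NEEDED entries of an ELF file.
# needed_libs is a hypothetical helper that parses `readelf -d` output on stdin.

needed_libs() {
    # Matching lines look like:
    #  0x0000000000000001 (NEEDED)   Shared library: [libffi.so.8]
    sed -n 's/.*(NEEDED).*Shared library: \[\([^]]*\)].*/\1/p'
}

# Demo on canned input in the same format that readelf prints.
needed_libs <<'EOF'
 0x0000000000000001 (NEEDED)             Shared library: [libffi.so.8]
 0x000000000000000e (SONAME)             Library soname: [libexample.so.1]
EOF

# Real usage (the path is an assumption; point it at the binary you care about):
#   readelf -d /usr/bin/ruby-mri | needed_libs
```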
> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/thunde... collect2: error: ld returned 1 exit status
The logs don't contain any more information. This is a static linker failure when building libxul.so.
Rawhide did build successfully on 2021-12-15.
> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/xs/ Run-time dependency ffi found: NO (tried pkgconfig and cmake)
Run-time dependency ffi found: NO (tried pkgconfig and cmake)
Library ffi found: YES
^^^^^^^^^^^^^^^^^^^^^^
Run-time dependency gc found: NO (tried pkgconfig and cmake)
Library gc found: YES
Library gccpp found: YES
Run-time dependency readline found: YES 8.1
Program touch found: YES (/usr/bin/touch)
Program ../generators/buildinfo.sh found: NO
meson.build:24:2: ERROR: Program '../generators/buildinfo.sh' not found or not executable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A full log can be found at /builddir/build/BUILD/XS-789540c5f208b8e8f07fc81c3bec3d0ee47c6dea/build/meson-logs/meson-log.txt
xs has been FTBFS since July 2021: https://koji.fedoraproject.org/koji/buildinfo?buildID=1805437
On 08. 01. 22 18:09, Carlos O'Donell wrote:
> > https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/cjs/
> > https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/gjs/
> > both "killed by signal 11 SIGSEGV"
> Right, this is because there is an ordering dependency between gobject-introspection and cjs/gjs. You need the introspection library rebuilt first with libffi 3.4+ and then build the JavaScript packages; that way they both use the same library and can pass objects back and forth for introspection.
Indeed. Second build works nicely. I was sure I've tried that, but apparently not.
On 08. 01. 22 at 18:09, Carlos O'Donell wrote:
> > https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/ruby/ Segmentation fault FAIL 1/1489 tests failed
> This failed in the same place in two different builds in the test_ractor.rb (Ruby Ractor) test case, and it crashes in 'ractor_select()' within 'rb_vm_exec()'.
> It is odd that it should fail with the libffi update, since the failure is in the Ruby Ractor test, which I wouldn't expect to use any of the FFI APIs. It has failed twice in the same place, though.
> I didn't see this in c9s. The last built ruby in c9s was built by me and it is 3.0.2-155, where test_ractor.rb passes just fine built with libffi 3.4.
> The ruby-mri binary has no deep DT_NEEDED dependencies that would need libffi or other libraries to be built in a particular order, but with dlopen you can get odd ordering issues that are only resolved after the SONAME bump is complete and the rebuilds have completed across dependent libraries.
> Rawhide did build successfully on 2021-12-10.
This is reported upstream:
https://bugs.ruby-lang.org/issues/18412
Just keep trying and it will eventually pass.
Vít
On 10. 01. 22 10:56, Vít Ondruch wrote:
> This is reported upstream:
> https://bugs.ruby-lang.org/issues/18412
> Just keep trying and it will eventually pass.
OK then, running:
while ! \fedpkg build --fail-fast; do sleep 5; done
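A capped variant of the loop above might look like this. It is a sketch, not what was actually run: `retry` and `RETRY_DELAY` are hypothetical names, and the limit of 10 in the usage comment is arbitrary.

```shell
#!/bin/bash
# Sketch only: retry a flaky command a limited number of times.
# retry is a hypothetical helper; RETRY_DELAY (seconds between attempts)
# is likewise made up for this example and defaults to 5.

retry() {
    local max_attempts=$1
    shift
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-5}"
    done
    echo "succeeded on attempt $attempt" >&2
}

# Usage, mirroring the loop above but stopping after 10 failed builds:
#   retry 10 fedpkg build --fail-fast
```

Capping the attempts avoids leaving an endless stream of builds running if the failure turns out not to be a flaky test after all.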
On 10. 01. 22 15:33, Miro Hrončok wrote:
> OK then, running:
> while ! \fedpkg build --fail-fast; do sleep 5; done
I have stopped now, with the 11th build running. You will eventually need to rebuild ruby for https://fedoraproject.org/wiki/Changes/Ruby_3.1 anyway.
On 10. 01. 22 at 18:07, Miro Hrončok wrote:
> I have stopped now, with the 11th build running. You will eventually need to rebuild ruby for https://fedoraproject.org/wiki/Changes/Ruby_3.1 anyway.
Well, but I am afraid the failures are likely different from the original one. Please see:
https://koschei.fedoraproject.org/package/ruby?collection=f36
All the buildroot changes seem to be related to FFI. Also, scratch builds are failing in some strange way:
https://src.fedoraproject.org/rpms/ruby/pull-request/106
I am not really sure what to blame.
Vít
On 10. 01. 22 at 18:18, Vít Ondruch wrote:
Some progress. So there is e.g. this failure:
~~~
1) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_autoload.rb:380]:
[ruby-core:86410] [Bug #14634].
Expected #<Test::Unit::AssertionFailedError: Expected #<Process::Status: pid 3249430 SIGABRT (signal 6) (core dumped)> to be success?.> to be nil.
~~~
Running the test on its own, it passes:
~~~
$ make test-all TESTS="test/ruby/test_autoload.rb -v -n /TestAutoload#test_autoload_fork/"
./revision.h unchanged
Run options: --seed=15497 "--ruby=./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" --excludes-dir=./test/excludes --name=!/memory_leak/ -v -n /TestAutoload#test_autoload_fork/

# Running tests:

[1/0] TestAutoload#test_autoload_fork = 0.31 s
Finished tests in 0.316094s, 3.1636 tests/s, 18.9817 assertions/s.
1 tests, 6 assertions, 0 failures, 0 errors, 0 skips

ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
~~~
However, when it is run together with the Fiddle tests (Fiddle is the FFI wrapper in Ruby), it fails:
~~~
$ make test-all TESTS="test/fiddle/test_import.rb test/ruby/test_autoload.rb -v -n /TestAutoload#test_autoload_fork/"
./revision.h unchanged
Run options: --seed=15 "--ruby=./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" --excludes-dir=./test/excludes --name=!/memory_leak/ -v -n /TestAutoload#test_autoload_fork/

# Running tests:

[1/0] TestAutoload#test_autoload_fork = 0.42 s

1) Failure: TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_autoload.rb:380]: [ruby-core:86410] [Bug #14634]. Expected #<Test::Unit::AssertionFailedError: Expected #<Process::Status: pid 10859 SIGABRT (signal 6) (core dumped)> to be success?.> to be nil.
2) Failure: TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/tool/lib/zombie_hunter.rb:6]: Expected [[10858, #<Process::Status: pid 10858 exit 0>], [10860, #<Process::Status: pid 10860 SIGABRT (signal 6) (core dumped)>]] to be empty.

Finished tests in 0.423522s, 2.3612 tests/s, 9.4446 assertions/s.
1 tests, 4 assertions, 2 failures, 0 errors, 0 skips

ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
make: *** [uncommon.mk:822: yes-test-all] Aborted (core dumped)
~~~
However, it is still not clear what is wrong.
Vít
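The pairing experiment above (one target test run alongside one other test file) can be automated to find which companion files trigger the failure. The following is a rough sketch under assumptions: `find_interfering`, `run_tests` and the `RUN_TESTS` override are hypothetical names, and it relies on `make test-all` exiting non-zero when a test fails, as it does in the log above.

```shell
#!/bin/bash
# Sketch only: run a target test together with each candidate test file in turn
# and print the candidates whose presence makes the run fail.
# find_interfering, run_tests and RUN_TESTS are hypothetical names.

# Default runner; receives the whole TESTS string as $1.
run_tests() {
    make test-all TESTS="$1"
}

find_interfering() {
    local target=$1
    shift
    local candidate
    for candidate in "$@"; do
        # RUN_TESTS may name another command or function, e.g. a stub.
        if ! "${RUN_TESTS:-run_tests}" "$candidate $target" >/dev/null 2>&1; then
            echo "$candidate"
        fi
    done
}

# Usage from the top of the Ruby build tree (the glob is an assumption):
#   find_interfering \
#       "test/ruby/test_autoload.rb -v -n /TestAutoload#test_autoload_fork/" \
#       test/fiddle/test_*.rb
```

Each candidate costs one full `make test-all` invocation, so this only makes sense for a small set of suspects such as the Fiddle tests.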
On 11. 01. 22 at 17:21, Vít Ondruch wrote:
Going further, I was able to minimize the test_import.rb:
~~~
# coding: US-ASCII
# frozen_string_literal: true
begin
  require_relative 'helper'
  require 'fiddle/import'
rescue LoadError
end

module Fiddle
  module LIBC
    extend Importer
    dlload LIBC_SO, LIBM_SO

    CallCallback = bind("void call_callback(void*, void*)"){ | ptr1, ptr2|
      f = Function.new(ptr1.to_i, [TYPE_VOIDP], TYPE_VOID)
      f.call(ptr2)
    }
  end
end if defined?(Fiddle)
~~~
The `CallCallback` binding is what makes the difference.
Vít
I got somewhere:
~~~
$ gdb --args ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems "./test/runner.rb" --ruby="./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" --excludes-dir=./test/excludes --name='!/memory_leak/' test/fiddle/test_import.rb test/ruby/test_autoload.rb -v -n '/TestAutoload#test_autoload_fork/'
GNU gdb (GDB) Fedora 11.1-6.fc36
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...
warning: File "/builddir/build/BUILD/ruby-3.1.0/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /builddir/build/BUILD/ruby-3.1.0/.gdbinit
line to your configuration file "/builddir/.config/gdb/gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/builddir/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
(gdb) r
Starting program: /builddir/build/BUILD/ruby-3.1.0/miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test/runner.rb --ruby=./miniruby\ -I./lib\ -I.\ -I.ext/common\ \ ./tool/runruby.rb\ --extout=.ext\ \ --\ --disable-gems --excludes-dir=./test/excludes --name=!/memory_leak/ test/fiddle/test_import.rb test/ruby/test_autoload.rb -v -n /TestAutoload#test_autoload_fork/
Download failed: No route to host. Continuing without debug info for /builddir/build/BUILD/ruby-3.1.0/system-supplied DSO at 0x7ffff7fc4000.
Download failed: No route to host. Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host. Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host. Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host. Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host. Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
process 13364 is executing new program: /builddir/build/BUILD/ruby-3.1.0/ruby
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34.9000-36.fc36.x86_64 gmp-6.2.1-1.fc36.x86_64 libxcrypt-4.4.27-1.fc36.x86_64 zlib-1.2.11-30.fc35.x86_64
Download failed: No route to host. Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host. Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host. Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host. Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host. Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Run options: --seed=54837 "--ruby=./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" --excludes-dir=./test/excludes --name=!/memory_leak/ -v -n /TestAutoload#test_autoload_fork/
# Running tests:
[Detaching after vfork from child process 13401]
[1/0] TestAutoload#test_autoload_fork[New Thread 0x7ffff4ccf640 (LWP 13402)]
[New Thread 0x7ffff4bae640 (LWP 13403)]
[New Thread 0x7ffff4a8d640 (LWP 13404)]
[New Thread 0x7ffff496c640 (LWP 13405)]
[New Thread 0x7ffff484b640 (LWP 13406)]
[New Thread 0x7ffff472a640 (LWP 13407)]
[Detaching after fork from child process 13408]
[Detaching after fork from child process 13409]
[Detaching after fork from child process 13410]
= 0.39 s
1) Failure: TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_autoload.rb:380]: [ruby-core:86410] [Bug #14634]. Expected #<Test::Unit::AssertionFailedError: Expected #<Process::Status: pid 13409 SIGABRT (signal 6) (core dumped)> to be success?.> to be nil.
2) Failure: TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/tool/lib/zombie_hunter.rb:6]: Expected [[13410, #<Process::Status: pid 13410 SIGABRT (signal 6) (core dumped)>]] to be empty.
Finished tests in 0.392854s, 2.5455 tests/s, 12.7274 assertions/s. 1 tests, 5 assertions, 2 failures, 0 errors, 0 skips
ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
Thread 1 "ruby" received signal SIGABRT, Aborted.
0x00007ffff78a764c in __pthread_kill_implementation () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34.9000-36.fc36.x86_64 gmp-6.2.1-1.fc36.x86_64 libxcrypt-4.4.27-1.fc36.x86_64 zlib-1.2.11-30.fc35.x86_64
(gdb) bt
#0 0x00007ffff78a764c in __pthread_kill_implementation () from /lib64/libc.so.6
#1 0x00007ffff785a656 in raise () from /lib64/libc.so.6
#2 0x00007ffff7844833 in abort () from /lib64/libc.so.6
#3 0x00007ffff4d0a5b8 in dlfree (mem=0x7ffff7bf1010) at ../src/dlmalloc.c:4350
#4 0x00007ffff4d190b1 in dealloc (ptr=0x5555558c1c00) at /builddir/build/BUILD/ruby-3.1.0/ext/fiddle/closure.c:32
#5 0x00007ffff7cb7801 in run_final (zombie=140737300557440, objspace=0x55555555d800) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4011
#6 finalize_list (objspace=objspace@entry=0x55555555d800, zombie=140737300557440) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4030
#7 0x00007ffff7cb80cc in rb_objspace_call_finalizer (objspace=0x55555555d800) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4194
#8 0x00007ffff7ca56eb in rb_ec_finalize (ec=0x55555555dd70) at /builddir/build/BUILD/ruby-3.1.0/eval.c:164
#9 rb_ec_cleanup (ec=ec@entry=0x55555555dd70, ex0=<optimized out>) at /builddir/build/BUILD/ruby-3.1.0/eval.c:256
#10 0x00007ffff7ca5c14 in ruby_run_node (n=0x7ffff7699660) at /builddir/build/BUILD/ruby-3.1.0/eval.c:321
#11 0x000055555555518f in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:47
(gdb) f 3
#3 0x00007ffff4d0a5b8 in dlfree (mem=0x7ffff7bf1010) at ../src/dlmalloc.c:4350
4350 USAGE_ERROR_ACTION(fm, p);
(gdb) l
4345 check_free_chunk(fm, p);
4346 goto postaction;
4347 }
4348 }
4349 erroraction:
4350 USAGE_ERROR_ACTION(fm, p);
4351 postaction:
4352 POSTACTION(fm);
4353 }
4354 }
~~~
Vít
On 11. 01. 22 at 17:26, Vít Ondruch wrote:
Dne 11. 01. 22 v 17:21 Vít Ondruch napsal(a):
Dne 10. 01. 22 v 18:18 Vít Ondruch napsal(a):
Dne 10. 01. 22 v 18:07 Miro Hrončok napsal(a):
On 10. 01. 22 15:33, Miro Hrončok wrote:
On 10. 01. 22 10:56, Vít Ondruch wrote:
Dne 08. 01. 22 v 18:09 Carlos O'Donell napsal(a): > >> https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/ruby/ >> >> Segmentation fault >> FAIL 1/1489 tests failed > This failed in the same place in two different builds in the > test_ractor.rb (Ruby Ractor) > test case, and it crashes in 'ractor_select()' within > 'rb_vm_exec()'. > > This is odd that it should fail with the libffi update since > this the failure is in the > Ruby Ractor test, which I wouldn't expect to use any of the FFI > APIs. It has failed > twice though in the same place. > > I didn't see this in c9s. The last built ruby in c9s was built > by me and it is 3.0.2-155, > where test_ractor.rb passes just fine built with libffi 3.4. > > The ruby-mri binary has no deep DT_NEEDED dependencies which > should need libffi or other > libraries to be built in a particular order, but with dlopen you > can get odd ordering > issues that are only resolved after the SONAME bump is complete > and rebuilds completed > across dependent libraries. > > Rawhide did build successfully on 2021-12-10. >
This is reported upstream:
https://bugs.ruby-lang.org/issues/18412
Just keep trying and it will eventually pass.
OK then, running:
while ! \fedpkg build --fail-fast; do sleep 5; done
I have stopped now, with 11th build running. You will eventually need to rebuild ruby for https://fedoraproject.org/wiki/Changes/Ruby_3.1 anyway.
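The loop above retries until a build succeeds, with no upper limit. A bounded variant could look like the sketch below. The `retry` helper, its argument order and the attempt limit are assumptions made for illustration; they are not part of fedpkg.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: hypothetical helper, not part of fedpkg.
# Runs CMD until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 1 if every attempt failed.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "succeeded on attempt $attempt"
}

# Usage (not run here): retry 11 5 fedpkg build --fail-fast
```

With a limit in place, a build that fails for a deterministic reason stops consuming builders after a known number of attempts.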
Well, but I am afraid the failures are likely different than the original one. Please see:
https://koschei.fedoraproject.org/package/ruby?collection=f36
All the buildroot changes seem to be related to FFI. Also, scratch builds are failing in some strange way:
https://src.fedoraproject.org/rpms/ruby/pull-request/106
I am not really sure what to blame.
Vít
Some progress. So there is e.g. this failure:
1) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_autoload.rb:380]:
[ruby-core:86410] [Bug #14634].
Expected #<Test::Unit::AssertionFailedError: Expected #<Process::Status: pid 3249430 SIGABRT (signal 6) (core dumped)> to be success?.> to be nil.

Running the test on its own, it passes:

$ make test-all TESTS="test/ruby/test_autoload.rb -v -n /TestAutoload#test_autoload_fork/"
./revision.h unchanged
Run options: --seed=15497 "--ruby=./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" --excludes-dir=./test/excludes --name=!/memory_leak/ -v -n /TestAutoload#test_autoload_fork/

# Running tests:

[1/0] TestAutoload#test_autoload_fork = 0.31 s

Finished tests in 0.316094s, 3.1636 tests/s, 18.9817 assertions/s.
1 tests, 6 assertions, 0 failures, 0 errors, 0 skips

ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]

However, run together with the Fiddle tests (Fiddle is the FFI wrapper in Ruby), it fails:

$ make test-all TESTS="test/fiddle/test_import.rb test/ruby/test_autoload.rb -v -n /TestAutoload#test_autoload_fork/"
./revision.h unchanged
Run options: --seed=15 "--ruby=./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" --excludes-dir=./test/excludes --name=!/memory_leak/ -v -n /TestAutoload#test_autoload_fork/

# Running tests:

[1/0] TestAutoload#test_autoload_fork = 0.42 s

1) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_autoload.rb:380]:
[ruby-core:86410] [Bug #14634].
Expected #<Test::Unit::AssertionFailedError: Expected #<Process::Status: pid 10859 SIGABRT (signal 6) (core dumped)> to be success?.> to be nil.

2) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/tool/lib/zombie_hunter.rb:6]:
Expected [[10858, #<Process::Status: pid 10858 exit 0>], [10860, #<Process::Status: pid 10860 SIGABRT (signal 6) (core dumped)>]] to be empty.

Finished tests in 0.423522s, 2.3612 tests/s, 9.4446 assertions/s.
1 tests, 4 assertions, 2 failures, 0 errors, 0 skips

ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
make: *** [uncommon.mk:822: yes-test-all] Aborted (core dumped)

However, it is still not clear what is wrong.
Vít
Going further, I was able to minimize the test_import.rb:
    # coding: US-ASCII
    # frozen_string_literal: true
    begin
      require_relative 'helper'
      require 'fiddle/import'
    rescue LoadError
    end

    module Fiddle
      module LIBC
        extend Importer
        dlload LIBC_SO, LIBM_SO

        CallCallback = bind("void call_callback(void*, void*)"){ | ptr1, ptr2|
          f = Function.new(ptr1.to_i, [TYPE_VOIDP], TYPE_VOID)
          f.call(ptr2)
        }
      end
    end if defined?(Fiddle)

The `CallCallback` binding is what makes the difference: with it the failure reproduces.
Vít
On 1/11/22 13:45, Vít Ondruch wrote:
Thread 1 "ruby" received signal SIGABRT, Aborted.
0x00007ffff78a764c in __pthread_kill_implementation () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34.9000-36.fc36.x86_64 gmp-6.2.1-1.fc36.x86_64 libxcrypt-4.4.27-1.fc36.x86_64 zlib-1.2.11-30.fc35.x86_64
(gdb) bt
#0 0x00007ffff78a764c in __pthread_kill_implementation () from /lib64/libc.so.6
#1 0x00007ffff785a656 in raise () from /lib64/libc.so.6
#2 0x00007ffff7844833 in abort () from /lib64/libc.so.6
#3 0x00007ffff4d0a5b8 in dlfree (mem=0x7ffff7bf1010) at ../src/dlmalloc.c:4350
This is libffi's internal allocator detecting an inconsistency.
#4 0x00007ffff4d190b1 in dealloc (ptr=0x5555558c1c00) at /builddir/build/BUILD/ruby-3.1.0/ext/fiddle/closure.c:32
This is fiddle calling ffi_closure_free() because USE_FFI_CLOSURE_ALLOC is non-zero.
The original closure was allocated in allocate() in fiddle.
What happened to the closure between allocation and free?
Does the memory location change?
Does something corrupt the closure?
#5 0x00007ffff7cb7801 in run_final (zombie=140737300557440, objspace=0x55555555d800) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4011
#6 finalize_list (objspace=objspace@entry=0x55555555d800, zombie=140737300557440) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4030
#7 0x00007ffff7cb80cc in rb_objspace_call_finalizer (objspace=0x55555555d800) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4194
#8 0x00007ffff7ca56eb in rb_ec_finalize (ec=0x55555555dd70) at /builddir/build/BUILD/ruby-3.1.0/eval.c:164
#9 rb_ec_cleanup (ec=ec@entry=0x55555555dd70, ex0=<optimized out>) at /builddir/build/BUILD/ruby-3.1.0/eval.c:256
#10 0x00007ffff7ca5c14 in ruby_run_node (n=0x7ffff7699660) at /builddir/build/BUILD/ruby-3.1.0/eval.c:321
#11 0x000055555555518f in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:47
(gdb) f 3
#3 0x00007ffff4d0a5b8 in dlfree (mem=0x7ffff7bf1010) at ../src/dlmalloc.c:4350
4350 USAGE_ERROR_ACTION(fm, p);
We only get here when the incoming pointer is invalid.
(gdb) l
4345 check_free_chunk(fm, p);
4346 goto postaction;
4347 }
4348 }
4349 erroraction:
4350 USAGE_ERROR_ACTION(fm, p);
4351 postaction:
4352 POSTACTION(fm);
4353 }
4354 }
I wish I had answers for you.
Nevertheless, it would help if I knew how to debug the detached children after fork, because they are failing earlier than the main process. I was using `set follow-fork-mode child`, but that does nothing :/
I think that exec or fork is the culprit, in some way.
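One way to keep gdb attached to every process across fork and exec is sketched below. The settings are standard gdb commands, but whether they catch the failing child in this particular test setup is an assumption, and the command-file name is arbitrary.

```shell
#!/bin/sh
# Write a gdb command file that keeps forked children under gdb's control.
# The file name is an arbitrary choice for this sketch.
GDB_CMDS="${TMPDIR:-/tmp}/fork-debug.gdb"
cat > "$GDB_CMDS" <<'EOF'
# Stay attached to both parent and child after fork instead of detaching one.
set detach-on-fork off
# Debug the child after fork; the parent is held suspended until resumed.
set follow-fork-mode child
# Create a new inferior for the program loaded by exec, keeping the old one listed.
set follow-exec-mode new
# Resume all inferiors together so a held parent does not stall the run.
set schedule-multiple on
# Stop in whichever process raises SIGABRT.
catch signal SIGABRT
EOF
echo "wrote $GDB_CMDS"

# Usage (not run here): gdb -x "$GDB_CMDS" --args ./miniruby ...
```

Once gdb stops, `info inferiors` lists the processes it is tracking and `inferior N` switches between them.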
Vít
So as I already mentioned, the following fails:
~~~
$ ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems "./test/runner.rb" --ruby="./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" --excludes-dir=./test/excludes --name='!/memory_leak/' test/fiddle/test_import.rb test/ruby/test_autoload.rb -v -n '/TestAutoload#test_autoload_fork/'
~~~
While the command itself is not very readable, the main executable is the runruby.rb file, which essentially sets up some environment and then calls `exec` [1]. I believe it execs something similar to the following command:
~~~
$ LD_LIBRARY_PATH=. ./ruby -I./lib -I. -I./tool/lib -I.ext/common --disable-gems -rtest/fiddle/test_import.rb -rtest/ruby/test_autoload.rb -e '' -- -v -n '/TestAutoload#test_autoload_fork/'
~~~
But the command above succeeds. So it must be some strange interaction caused by the `exec` call?
Vít
[1] https://github.com/ruby/ruby/blob/v3_1_0/tool/runruby.rb#L181
I have reported this here:
https://bugzilla.redhat.com/show_bug.cgi?id=2040380
And I'd appreciate any help to figure out what is going on.
Vít
On 08. 01. 22 18:09, Carlos O'Donell wrote:
On 1/8/22 04:37, Miro Hrončok wrote:
Hello packagers,
I intend to rebuild the following packages with libffi 3.4 in Rawhide side tag f36-build-side-49314 today.
Thank you for helping with the rebuilds!
The previous version remains available as libffi3.1, so failures to build will not result in uninstallable packages.
You can inspect some known failures:
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/jffi/ make: *** No rule to make target '-L/usr/lib64/../lib64', needed by '/builddir/build/BUILD/jffi-jffi-1.3.4/build/jni/jffi/Array.o'. Stop.
I don't know why this one fails. Passed in c9s with earlier jffi and libffi 3.4.
Rawhide did build successfully on 2021-08-22, but that was a while ago.
https://koschei.fedoraproject.org/package/jffi indicates this is libffi 3.4 related.
https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/package/thunde... collect2: error: ld returned 1 exit status
The logs don't contain any more information. This is a static linker failure when building libxul.so.
Rawhide did build successfully on 2021-12-15.
https://koschei.fedoraproject.org/package/thunderbird indicates this is not related.
---------
Not yet rebuilt packages:
$ repoquery -q --repo=koji --whatrequires libffi3.1 --source | pkgname
gambas3 hadolint jffi llvm llvm10 llvm11 llvm12 llvm9.0 python2.7 python3.6 python3.7 ruby thunderbird xs
All are known. jffi seems to be the only libffi-related failure.
On 10. 01. 22 18:14, Miro Hrončok wrote:
All are known. jffi seems to be the only libffi-related failure.
OK, ruby also seems related:
https://koschei.fedoraproject.org/package/ruby
On 1/10/22 14:04, Miro Hrončok wrote:
All are known. jffi seems to be the only libffi-related failure.
OK, ruby also seems related:
Agreed.
On 08. 01. 22 10:37, Miro Hrončok wrote:
I intend to rebuild the following packages with libffi 3.4 in Rawhide side tag f36-build-side-49314 today.
The update: https://bodhi.fedoraproject.org/updates/FEDORA-2022-c440651258
On 08. 01. 22 10:37, Miro Hrončok wrote:
Hello packagers,
I intend to rebuild the following packages with libffi 3.4 in Rawhide side tag f36-build-side-49314 today.
The previous version remains available as libffi3.1, so failures to build will not result in uninstallable packages.
You can inspect some known failures: ...
And some previously unexpected failures:
python2.7 and python3.6
=======================
test_tkinter fails on i686
this is *not* related to new libffi
unfortunately koschei does not build on 32bit arches, so we have no idea how long has this been happening :(
will investigate and open bugzillas on Monday

python3.7
=========
test_zlib fails on s390x
not verified yet if related to libffi, but I don't think so
koschei does not build on s390x either :(
will investigate and open bugzilla on Monday

llvm, llvm10, llvm11, and llvm12
================================
failures on s390x or armv7hl
llvm maintainers CC'ed

gambas3
=======
broken build dependencies due to the recent libre2 unannounced soname bump
I suppose I could tag in older libre2 just for the rebuild :/
On 09. 01. 22 11:07, Miro Hrončok wrote:
python2.7 and python3.6
test_tkinter fails on i686
this is *not* related to new libffi
unfortunately koschei does not build on 32bit arches, so we have no idea how long has this been happening :(
will investigate and open bugzillas on Monday
https://bugzilla.redhat.com/show_bug.cgi?id=2038843
python3.7
test_zlib fails on s390x
not verified yet if related to libffi, but I don't think so
koschei does not build on s390x either :(
will investigate and open bugzilla on Monday
not related to libffi either
https://bugzilla.redhat.com/show_bug.cgi?id=2038848
gambas3
broken build dependencies due to the recent libre2 unannounced soname bump
I suppose I could tag in older libre2 just for the rebuild :/
Transient issue solved, package rebuilt.