Hi,
I'm trying to compile systemd in koji and mock, and I'm getting suspicious crashes...
$ valgrind x86_64-redhat-linux-gnu/test-terminal-util
/* test_default_term_for_tty */
...
/* test_read_one_char */
==21== Invalid read of size 4
==21==    at 0x48C09EC: fputs (in /usr/lib64/libc-2.29.9000.so)
==21==    by 0x109301: UnknownInlinedFun (test-terminal-util.c:43)
==21==    by 0x109301: main (test-terminal-util.c:80)
==21==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==21==
==21==
==21== Process terminating with default action of signal 11 (SIGSEGV)
The problem is at this line, there is just a call to (a function which transitively calls) mkostemp(). It seems like the inlining is somehow going wrong.
Strangely, gdb also crashes:
$ gdb x86_64-redhat-linux-gnu/test-terminal-util
GNU gdb (GDB) Fedora 8.3.50.20190321-3.fc31
...
Reading symbols from x86_64-redhat-linux-gnu/test-terminal-util...
(gdb) r
Starting program: /builddir/build/BUILD/systemd-49bd196d693efe0acfc8d56c4e3d8f7ba9f91b5d/x86_64-redhat-linux-gnu/test-terminal-util
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29.9000-8.fc31.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
/* test_default_term_for_tty */
...
/* test_read_one_char */
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e759ec in fputs () from /lib64/libc.so.6
Segmentation fault (core dumped)
There are also compilation failures related to inlining when I disable LTO:
In file included from ../src/basic/macro.h:549,
                 from ../src/basic/alloc-util.h:9,
                 from ../src/network/networkd-link.c:9:
In function ‘link_enable_ipv6’,
    inlined from ‘link_set_mtu’ at ../src/network/networkd-link.c:1483:16:
../src/basic/log.h:104:9: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
  104 |         log_internal_realm(LOG_REALM_PLUS_LEVEL(LOG_REALM, (level)), __VA_ARGS__)
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/shared/log-link.h:21:25: note: in expansion of macro ‘log_internal’
   21 |                         log_internal(level, error, __FILE__, __LINE__, __func__, ##__VA_ARGS__); \
      |                         ^~~~~~~~~~~~
../src/shared/log-link.h:33:50: note: in expansion of macro ‘log_link_full’
   33 | #define log_link_warning_errno(link, error, ...) log_link_full(link, LOG_WARNING, error, ##__VA_ARGS__)
      |                                                  ^~~~~~~~~~~~~
../src/network/networkd-link.c:324:17: note: in expansion of macro ‘log_link_warning_errno’
  324 |                 log_link_warning_errno(link, r, "Cannot %s IPv6 for interface %s: %m",
      |                 ^~~~~~~~~~~~~~~~~~~~~~
../src/network/networkd-link.c: In function ‘link_set_mtu’:
../src/network/networkd-link.c:324:79: note: format string is defined here
  324 |                 log_link_warning_errno(link, r, "Cannot %s IPv6 for interface %s: %m",
      |                                                                               ^~
The argument is a field in a structure, and it is always set when the structure is created. It's hard to say for sure that it's never null, but I think gcc must be confused when it says it's *always* null.
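For readers unfamiliar with this diagnostic, here is a minimal sketch of one way to make the "never null" assumption explicit at the point where the string reaches a `%s` directive. It is not taken from systemd: `struct link_sketch`, its `ifname` field, and `describe()` are hypothetical names used only for illustration, and the message text is merely modelled on the one in the error above. The idea is that if the code substitutes a placeholder whenever the field is unset, no path on which `%s` receives a null pointer remains for the compiler to warn about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a structure that carries an interface name. */
struct link_sketch {
    const char *ifname;
};

/* Format a message into buf. If the link or its name is unset, a
 * placeholder is used so that "%s" never receives a null pointer.
 * Returns the value returned by snprintf(). */
static int describe(const struct link_sketch *l, char *buf, size_t size) {
    assert(buf != NULL);

    const char *name = (l != NULL && l->ifname != NULL) ? l->ifname : "(unset)";
    return snprintf(buf, size, "Cannot enable IPv6 for interface %s", name);
}
```

Whether such a guard is appropriate depends on the code base; where the field really can never be null, an assertion at the call site may express the intent better than a silent fallback.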
The same rpm compiles fine with gcc-9.0.1-0.8.fc30, gcc-9.0.1-0.8.fc31. I'm writing to the mailing list instead of opening a bug because I'm not really sure if gcc is at fault, or if systemd code is somehow buggy in a non-obvious way... Has anyone else seen similar failures with the latest gcc build?
Zbyszek
example failed koji scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=33792874
On Wed, Mar 27, 2019 at 01:55:44PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> Hi,
> I'm trying to compile systemd in koji and mock, and I'm getting suspicious crashes...
> $ valgrind x86_64-redhat-linux-gnu/test-terminal-util
> /* test_default_term_for_tty */
> ...
> /* test_read_one_char */
> ==21== Invalid read of size 4
> ==21==    at 0x48C09EC: fputs (in /usr/lib64/libc-2.29.9000.so)
> ==21==    by 0x109301: UnknownInlinedFun (test-terminal-util.c:43)
> ==21==    by 0x109301: main (test-terminal-util.c:80)
> ==21==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==21==
> ==21==
> ==21== Process terminating with default action of signal 11 (SIGSEGV)
> The problem is at this line, there is just a call to (a function which transitively calls) mkostemp(). It seems like the inlining is somehow going wrong.
It turns out that our test case was wrong. I was confused because the inlining causes the backtrace to report an unrelated spot.
> Strangely, gdb also crashes:
> $ gdb x86_64-redhat-linux-gnu/test-terminal-util
> GNU gdb (GDB) Fedora 8.3.50.20190321-3.fc31
> ...
> Reading symbols from x86_64-redhat-linux-gnu/test-terminal-util...
> (gdb) r
> Starting program: /builddir/build/BUILD/systemd-49bd196d693efe0acfc8d56c4e3d8f7ba9f91b5d/x86_64-redhat-linux-gnu/test-terminal-util
> Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29.9000-8.fc31.x86_64
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> /* test_default_term_for_tty */
> ...
> /* test_read_one_char */
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff7e759ec in fputs () from /lib64/libc.so.6
> Segmentation fault (core dumped)
This is still a problem. gdb crashes on any program in rawhide mock for me right now. But gcc seems to be fine.
Zbyszek
On Thu, Mar 28, 2019 at 08:52:18AM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> On Wed, Mar 27, 2019 at 01:55:44PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > I'm trying to compile systemd in koji and mock, and I'm getting suspicious crashes...
> > $ valgrind x86_64-redhat-linux-gnu/test-terminal-util
> > /* test_default_term_for_tty */
> > ...
> > /* test_read_one_char */
> > ==21== Invalid read of size 4
> > ==21==    at 0x48C09EC: fputs (in /usr/lib64/libc-2.29.9000.so)
> > ==21==    by 0x109301: UnknownInlinedFun (test-terminal-util.c:43)
> > ==21==    by 0x109301: main (test-terminal-util.c:80)
> > ==21==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> > ==21==
> > ==21==
> > ==21== Process terminating with default action of signal 11 (SIGSEGV)
> > The problem is at this line, there is just a call to (a function which transitively calls) mkostemp(). It seems like the inlining is somehow going wrong.
> It turns out that our test case was wrong. I was confused because the inlining causes the backtrace to report an unrelated spot.
So do you still need anything from me to debug? gdb crashes I'll defer to the gdb team. Is that with LTO only btw?
Jakub
On Thu, Mar 28, 2019 at 02:14:31PM +0100, Jakub Jelinek wrote:
> On Thu, Mar 28, 2019 at 08:52:18AM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > On Wed, Mar 27, 2019 at 01:55:44PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > > I'm trying to compile systemd in koji and mock, and I'm getting suspicious crashes...
> > > $ valgrind x86_64-redhat-linux-gnu/test-terminal-util
> > > /* test_default_term_for_tty */
> > > ...
> > > /* test_read_one_char */
> > > ==21== Invalid read of size 4
> > > ==21==    at 0x48C09EC: fputs (in /usr/lib64/libc-2.29.9000.so)
> > > ==21==    by 0x109301: UnknownInlinedFun (test-terminal-util.c:43)
> > > ==21==    by 0x109301: main (test-terminal-util.c:80)
> > > ==21==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> > > ==21==
> > > ==21==
> > > ==21== Process terminating with default action of signal 11 (SIGSEGV)
> > > The problem is at this line, there is just a call to (a function which transitively calls) mkostemp(). It seems like the inlining is somehow going wrong.
> > It turns out that our test case was wrong. I was confused because the inlining causes the backtrace to report an unrelated spot.
> So do you still need anything from me to debug?
Thanks. I need some advice mostly. There's still the question of bogus backtrace returned by valgrind. Is this a valgrind issue or the debug data produced by gdb or something else? If we cannot rely on backtraces with LTO, this would be a big drawback.
> gdb crashes I'll defer to the gdb team. Is that with LTO only btw?
No, LTO doesn't seem to be relevant, despite what I said earlier. With some programs (I tried a few, some crash, some don't, no idea what the rule is, but it seems that the very simple ones don't):
In mock buildroot of systemd:
$ ninja -C x86_64-redhat-linux-gnu systemd
$ gdb x86_64-redhat-linux-gnu/systemd
GNU gdb (GDB) Fedora 8.3.50.20190321-3.fc31
...
$ r
...
Trying to run as user instance, but the system has not been booted with systemd.
[Inferior 1 (process 2466) exited with code 01]
Segmentation fault (core dumped)
So the crash seems to happen when returning to the gdb prompt, whether because the debuggee exited, crashed, or hit a breakpoint (all three end the same way).
Zbyszek
Hi,
On Thu, 2019-03-28 at 14:28 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> On Thu, Mar 28, 2019 at 02:14:31PM +0100, Jakub Jelinek wrote:
> > On Thu, Mar 28, 2019 at 08:52:18AM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > > On Wed, Mar 27, 2019 at 01:55:44PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > > > I'm trying to compile systemd in koji and mock, and I'm getting suspicious crashes...
> > > > $ valgrind x86_64-redhat-linux-gnu/test-terminal-util
> > > > /* test_default_term_for_tty */
> > > > ...
> > > > /* test_read_one_char */
> > > > ==21== Invalid read of size 4
> > > > ==21==    at 0x48C09EC: fputs (in /usr/lib64/libc-2.29.9000.so)
> > > > ==21==    by 0x109301: UnknownInlinedFun (test-terminal-util.c:43)
> > > > ==21==    by 0x109301: main (test-terminal-util.c:80)
> > > > ==21==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> > > > ==21==
> > > > ==21==
> > > > ==21== Process terminating with default action of signal 11 (SIGSEGV)
> > > > The problem is at this line, there is just a call to (a function which transitively calls) mkostemp(). It seems like the inlining is somehow going wrong.
> > > It turns out that our test case was wrong. I was confused because the inlining causes the backtrace to report an unrelated spot.
> > So do you still need anything from me to debug?
> Thanks. I need some advice mostly. There's still the question of bogus backtrace returned by valgrind. Is this a valgrind issue or the debug data produced by gdb or something else? If we cannot rely on backtraces with LTO, this would be a big drawback.
The above backtrace is produced by valgrind. The addresses should be correct, but, as "UnknownInlinedFun" shows, it has some trouble resolving the associated function/symbol names.
I don't know if LTO makes that valgrind bug worse.
If gdb works then you can also use gdb and valgrind together: https://tromey.com/blog/?p=731
http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserv...
gdb probably can produce a better backtrace than valgrind.