https://fedoraproject.org/wiki/Changes/DebugInfoStandardization
== Summary ==
Fedora 18 implemented [[Features/DwarfCompressor]]. As the format did not become widespread and the tool is not much maintained, it became a burden to make existing debugging tools compatible with Fedora debug info.
== Owner ==
* Name: [[User:jankratochvil| Jan Kratochvil ]]
* Email: jan.kratochvil@redhat.com
== Detailed Description ==
Debug info files (*.debug) contained in *-debuginfo.rpm are very big in general (the x86_64 Fedora 32 distribution has a debug/ directory of 82GB while all its other files are only 75GB). There are several methods to make the *-debuginfo.rpms at least a bit smaller. Fedora 18 started using the DWZ tool (from [[Features/DwarfCompressor]]) while [https://gcc.gnu.org/pipermail/gcc-patches/2008-August/246281.html Google implemented] the same goal in a different way called -fdebug-types-section.
Almost nobody uses the existing Fedora DWZ (only Fedora/CentOS/RHEL and SuSE OSes) and so its support is missing in tools like [https://lldb.llvm.org/ LLDB], [https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html llvm-dwarfdump] or binutils readelf. -fdebug-types-section is used internally by Google (produced by clang). Debian does not store any debug info archives. Ubuntu uses neither -fdebug-types-section nor DWZ.
* DWZ advantage: On the whole Fedora distro it saves 3.3% (5GB of the 157GB distribution size)
** If the 3.3% size increase is a concern I can implement a different optimization ([https://whova.com/embedded/session/llvm_202010/1193947/ talk (2)]) as a GCC post-processing phase which would require no changes in any DWARF consumers.
* DWZ disadvantage: DWZ currently has less support across consumers (LLDB, llvm-dwarfdump, binutils readelf)
* DWZ disadvantage: DWZ requires 8 times more complicated (by LoC count) support in consumers than -fdebug-types-section.
* DWZ disadvantage: DWZ cannot update the LLVM .debug_names index, which can be generated only by clang (it cannot be regenerated later for a DWZ-compressed file)
* DWZ disadvantage: DWZ DWARF-5 support is a work in progress. DWZ has been blocking DWARF-5 for Fedora for 3.5 years and only after I have now proposed to drop DWZ has Mark Wielaard started porting DWZ to DWARF-5. It can be expected that future DWARF extensions will remain unsupported again. Even currently there is no plan to support the DWARF-5 features used by clang, which may need -fdebug-types-section for clang-built binaries or no size optimization of clang-built debug info at all.
* DWZ disadvantage: Compilation (linking) of C++ requires up to 2x as much disk space (as DWZ processes files after the linker and DWZ is incompatible with -fdebug-types-section)
* DWZ disadvantage: Compilation (linking) is slower
This proposed DWARF format was originally submitted already for Fedora 18 as [[Features/DebugTypesSections]].
== Benefit to Fedora ==
* Better compatibility with existing debugging and tracing tools, primarily [https://lldb.llvm.org/ LLDB].
* Less resource-intensive rebuilds of C++ packages (in disk space, memory requirements and compilation time).
== Scope ==
* Proposal owners: It affects all packages generating *-debuginfo.rpm, that is compiled (not scripted) languages.
* Other developers: Report any possible debuginfo incompatibility (unexpected).
* Release engineering: [https://pagure.io/releng/issues #Releng issue number] (a check of an impact with Release Engineering is needed)
* Policies and guidelines: All the needed changes should be done in [https://src.fedoraproject.org/rpms/redhat-rpm-config redhat-rpm-config]. The [https://src.fedoraproject.org/rpms/dwz dwz package] can be then retired.
* Trademark approval: N/A (not needed for this Change)
* Alignment with Objectives: The size differences are only for *-debuginfo.rpm which is outside of scope of the listed objectives.
== Upgrade/compatibility impact ==
As each *-debuginfo.rpm has to exactly match the NVRA of its binary package, compatibility is not relevant. Existing tools supporting DWZ will still support the DWZ file format in packages which have not been rebuilt.
== How To Test ==
The change will update [https://src.fedoraproject.org/rpms/redhat-rpm-config redhat-rpm-config] by [https://people.redhat.com/jkratoch/redhat-rpm-config-fdebug-types-section.pa... an -fdebug-types-section patch].
Then one can use rpmbuild to rebuild a package. For mock use -a|--addrepo with modified redhat-rpm-config.rpm (with increased NVRA). For packages already rebuilt in Koji nothing is needed.
Test whether programs like lldb and gdb can still print source code, function parameters, variables etc.
One should also verify that the integrated testsuites of tools like clang, lldb, gcc, binutils, gdb, elfutils or rpm do not regress with the -fdebug-types-section option.
One can also compare *.debug files built with/without DWZ and/or -fdebug-types-section using [https://src.fedoraproject.org/rpms/libabigail libabigail] utility dwdiff but that will more likely be done by the change owner.
== User Experience ==
No user visible change. This affects what tools developers can use.
== Dependencies ==
none
== Contingency Plan ==
* Contingency mechanism: Revert the change in [https://src.fedoraproject.org/rpms/redhat-rpm-config redhat-rpm-config]. Fedora can continue using DWZ, just some debugging/tracing tools will stay incompatible.
* Contingency deadline: beta freeze
* Blocks release? No
* Blocks product? N/A
== Documentation ==
* [http://www.dwarfstd.org/doc/DWARF5.pdf DWARF-5] E.2 Using Type Units
* [https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#index-fdebug-types... GCC -fdebug-types-section]
On Thu, Sep 24, 2020 at 12:00 PM Ben Cotton bcotton@redhat.com wrote:
Almost nobody uses existing Fedora DWZ (only Fedora/CentOS/RHEL and SuSE OSes) and so its support is missing in tools like [https://lldb.llvm.org/ LLDB], [https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html llvm-dwarfdump] or binutils readelf. -fdebug-types-section is used internally by Google (produced by clang). Debian does not store any debug info archives. Ubuntu uses neither -fdebug-types-section nor DWZ.
This is not true. Debian started producing -dbgsym packages and putting them in a separate repository years ago: https://wiki.debian.org/AutomaticDebugPackages
dwz is used by virtually all RPM based distributions now, including OpenMandriva (a clang-based distro). I know this because I implemented it. :)
I do not know whether Debian has started using dwz by default because I haven't dug into how the -dbgsym package generation works in detail.
-- 真実はいつも一つ!/ Always, there's only one truth!
On Thu, Sep 24, 2020 at 01:01:17PM -0400, Neal Gompa wrote:
On Thu, Sep 24, 2020 at 12:00 PM Ben Cotton bcotton@redhat.com wrote:
Almost nobody uses existing Fedora DWZ (only Fedora/CentOS/RHEL and SuSE OSes) and so its support is missing in tools like [https://lldb.llvm.org/ LLDB], [https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html llvm-dwarfdump] or binutils readelf. -fdebug-types-section is used internally by Google (produced by clang). Debian does not store any debug info archives. Ubuntu uses neither -fdebug-types-section nor DWZ.
This is not true. Debian started producing -dbgsym packages and putting them in a separate repository years ago: https://wiki.debian.org/AutomaticDebugPackages
dwz is used by virtually all RPM based distributions now, including OpenMandriva (a clang-based distro). I know this because I implemented it. :)
I do not know whether Debian has started using dwz by default because I haven't dug into how the -dbgsym package generation works in detail.
Most of the packages that use recent versions of debhelper (the tool that automates many steps of the Debian packaging) have run dwz for the past couple of years. I do not have any statistics though.
G'luck, Peter
On Thu, Sep 24, 2020 at 1:13 PM Peter Pentchev roam@ringlet.net wrote:
On Thu, Sep 24, 2020 at 01:01:17PM -0400, Neal Gompa wrote:
On Thu, Sep 24, 2020 at 12:00 PM Ben Cotton bcotton@redhat.com wrote:
Almost nobody uses existing Fedora DWZ (only Fedora/CentOS/RHEL and SuSE OSes) and so its support is missing in tools like [https://lldb.llvm.org/ LLDB], [https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html llvm-dwarfdump] or binutils readelf. -fdebug-types-section is used internally by Google (produced by clang). Debian does not store any debug info archives. Ubuntu uses neither -fdebug-types-section nor DWZ.
This is not true. Debian started producing -dbgsym packages and putting them in a separate repository years ago: https://wiki.debian.org/AutomaticDebugPackages
dwz is used by virtually all RPM based distributions now, including OpenMandriva (a clang-based distro). I know this because I implemented it. :)
I do not know whether Debian has started using dwz by default because I haven't dug into how the -dbgsym package generation works in detail.
Most of the packages that use recent versions of debhelper (the tool that automates many steps of the Debian packaging) have run dwz for the past couple of years. I do not have any statistics though.
Then that certainly means that Ubuntu uses this too, since they reuse the dbgsym subpackage generation for the ddeb system they have now.
So it sounds like the underlying premise of this whole Change is flawed, since everyone has been using dwz without telling the dwz developers. :)
-- 真実はいつも一つ!/ Always, there's only one truth!
On Thu, 24 Sep 2020 19:16:32 +0200, Neal Gompa wrote:
Then that certainly means that Ubuntu uses this too, since they reuse the dbgsym subpackage generation for the ddeb system they have now.
I am not very familiar with Debian/Ubuntu but I cannot find any use of DWZ there:
https://packages.ubuntu.com/groovy/amd64/bluez-dbg/download
llvm-dwarfdump -color=0 bluez-dbg_5.55-0ubuntu1_amd64/data/usr/lib/debug/.build-id/*/*.debug|grep DW_TAG_partial_unit
This debuginfo package has been built 2020-09-15.
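As an aside, the grep above can be turned into a small repeatable check. The following is a minimal sketch, assuming the textual output of llvm-dwarfdump has already been captured into a string. The function names and the sample excerpt are hypothetical, and treating DW_TAG_partial_unit as a sign of DWZ and DW_TAG_type_unit as a sign of -fdebug-types-section is only a heuristic.

```python
import re
from collections import Counter

# Unit-level tags used as hints (heuristic, not a definitive test):
#   DW_TAG_partial_unit -> typically produced by DWZ
#   DW_TAG_type_unit    -> produced by -fdebug-types-section
TAG_RE = re.compile(r"\bDW_TAG_(?:partial_unit|type_unit)\b")


def count_unit_tags(dump_text: str) -> Counter:
    """Count the hint tags found in llvm-dwarfdump text output."""
    return Counter(TAG_RE.findall(dump_text))


def classify(dump_text: str) -> str:
    """Return a rough label for how the debug info was size-optimized."""
    tags = count_unit_tags(dump_text)
    uses_dwz = tags["DW_TAG_partial_unit"] > 0
    uses_type_units = tags["DW_TAG_type_unit"] > 0
    if uses_dwz and uses_type_units:
        return "dwz+type-units"
    if uses_dwz:
        return "dwz"
    if uses_type_units:
        return "type-units"
    return "plain"


# Hypothetical excerpt shaped like llvm-dwarfdump output,
# not taken from any real package.
SAMPLE = """\
0x0000000b: DW_TAG_partial_unit
0x00000020:   DW_TAG_base_type
0x00000030: DW_TAG_compile_unit
"""

if __name__ == "__main__":
    print(classify(SAMPLE))
```

An empty result from the grep corresponds to the "plain" or "type-units" label here, which is why a single package is weak evidence either way.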
(Besides that this proposal is not based on whether Debian uses DWZ or not.)
Thanks, Jan
On Thu, 24 Sep 2020 at 13:44, Jan Kratochvil jan.kratochvil@redhat.com wrote:
On Thu, 24 Sep 2020 19:16:32 +0200, Neal Gompa wrote:
Then that certainly means that Ubuntu uses this too, since they reuse the dbgsym subpackage generation for the ddeb system they have now.
I am not much familiar with Debian/Ubuntu but I cannot find any use of DWZ there: https://packages.ubuntu.com/groovy/amd64/bluez-dbg/download llvm-dwarfdump -color=0 bluez-dbg_5.55-0ubuntu1_amd64/data/usr/lib/debug/.build-id/*/*.debug|grep DW_TAG_partial_unit
This debuginfo package has been built 2020-09-15.
(Besides that this proposal is not based on whether Debian uses DWZ or not.)
The original language of the proposal said no other distribution used DWZ, and that the format was not adopted and should be removed. So it comes across that it is based on whether Debian, Ubuntu, etc use it.
```
As the format did not get widespread and the tool is not much maintained it became burden to make existing debugging tools compatible with Fedora debug info.
....
Almost nobody uses existing Fedora DWZ (only Fedora/CentOS/RHEL and SuSE OSes) and so its support is missing in tools like [https://lldb.llvm.org/ LLDB], [https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html llvm-dwarfdump] or binutils readelf. -fdebug-types-section is used internally by Google (produced by clang). Debian does not store any debug info archives. Ubuntu uses neither -fdebug-types-section nor DWZ.
```
Just stick to the following:
The tool is not easily maintained, and has become a burden to make existing debugging tools, namely llvm, compatible with this method.
Also expect that cross-distribution support is going to be important. No distribution is an island entire of itself, and few 'customers' use just one distribution. If a lot of distributions have been using this because Fedora had been and that made things easier to work out, then work is going to be needed to get them to work together.
On Thu, Sep 24, 2020 at 2:04 PM Stephen John Smoogen smooge@gmail.com wrote:
On Thu, 24 Sep 2020 at 13:44, Jan Kratochvil jan.kratochvil@redhat.com wrote:
On Thu, 24 Sep 2020 19:16:32 +0200, Neal Gompa wrote:
Then that certainly means that Ubuntu uses this too, since they reuse the dbgsym subpackage generation for the ddeb system they have now.
I am not much familiar with Debian/Ubuntu but I cannot find any use of DWZ there: https://packages.ubuntu.com/groovy/amd64/bluez-dbg/download llvm-dwarfdump -color=0 bluez-dbg_5.55-0ubuntu1_amd64/data/usr/lib/debug/.build-id/*/*.debug|grep DW_TAG_partial_unit
This debuginfo package has been built 2020-09-15.
(Besides that this proposal is not based on whether Debian uses DWZ or not.)
The original language of the proposal said no other distribution used DWZ, and that the format was not adopted and should be removed. So it comes across that it is based on whether Debian, Ubuntu, etc use it.
As the format did not get widespread and the tool is not much maintained it became burden to make existing debugging tools compatible with Fedora debug info. .... Almost nobody uses existing Fedora DWZ (only Fedora/CentOS/RHEL and SuSE OSes) and so its support is missing in tools like [https://lldb.llvm.org/ LLDB], [https://llvm.org/docs/CommandGuide/llvm-dwarfdump.html llvm-dwarfdump] or binutils readelf. -fdebug-types-section is used internally by Google (produced by clang). Debian does not store any debug info archives. Ubuntu uses neither -fdebug-types-section nor DWZ.
For the record, the reason why it was hard to broaden adoption is that the patch wasn't upstreamed into rpm itself until RPM 4.14's release: https://rpm.org/wiki/Releases/4.14.0.html
That was only three years ago, and in the span of that time, it's gone from only Fedora using it to almost everyone using it now.
Just stick to the following:
The tool is not easily maintained, and has become a burden to make existing debugging tools, namely llvm, compatible with this method.
Also expect that cross-distribution support is going to be important. No distribution is an island entire of itself; and few 'customers' use just one distribution. If a lot of distributions have been using this because Fedora had been and it was easier to work out things.. then work is going to be needed to get them to work together..
I do not feel that this is a valid premise either, since the reason for no dwz support in LLDB is because nobody contributed it. I'm slightly surprised that Red Hat's debuginfo engineers hadn't already contributed support for it into LLDB. I wonder if the reason for that was the mistaken impression that dwz wasn't broadly used.
On Thu, 24 Sep 2020 20:10:45 +0200, Neal Gompa wrote:
I do not feel that this is a valid premise either, since the reason for no dwz support in LLDB is because nobody contributed it. I'm slightly surprised that Red Hat's debuginfo engineers hadn't already contributed support for it into LLDB.
I have implemented the support for DWZ into LLDB, it is just off-trunk now:
https://people.redhat.com/jkratoch/dwz-2020-05-31/
https://copr.fedorainfracloud.org/coprs/jankratochvil/lldb/package/lldb-expe...
dnf copr enable jankratochvil/lldb;dnf install lldb-experimental;lldb-experimental
The problem is that the support is complicated as it has to affect the whole LLDB DWARF codebase (handling of each DIE type). The following line is based on my patchset and it is exactly calculated:
https://fedoraproject.org/wiki/Changes/DebugInfoStandardization#Detailed_Des...
DWZ disadvantage: DWZ requires 8 times more complicated (by LoC count) support in consumers than -fdebug-types-section.
And all this support is needed despite almost none of the LLDB/LLVM users (Android, iOS, OSX, still AFAIK Ubuntu/Debian) using DWZ.
So when DWZ brings almost no size benefit compared to -fdebug-types-section, which is much easier to support, faster to compile and already better supported by existing consumers (*), why not switch to -fdebug-types-section?
(*) such as LLDB and the LLVM binutils; DWZ strings are not decoded even by, for example, binutils readelf
I wonder if the reason for that was the mistaken impression that dwz wasn't broadly used.
From LLDB point of view users of DWZ-built Linux distros are counted in millions, users of Android/iOS/OSX are counted in billions. :-)
Thanks for reply, Jan
On Thu, 24 Sep 2020 20:04:22 +0200, Stephen John Smoogen wrote:
The original language of the proposal said no other distribution used DWZ, and that the format was not adopted and should be removed.
I have already updated the Wiki in the meantime based on new information from this list:
https://fedoraproject.org/wiki/Changes/DebugInfoStandardization#Detailed_Des...
------------------------------------------------------------------------------
Only some of Linux distributions use existing Fedora DWZ (known are Fedora/CentOS/RHEL, SuSE OSes, OpenMandriva, maybe others?). [...] Debian and Ubuntu use neither -fdebug-types-section nor DWZ.
------------------------------------------------------------------------------
The tool is not easily maintained,
I did not mention it there but it is true there are some longterm unfixed bugs in DWZ so that it gives up on optimization of many builds:
error: Allocatable section after non-allocatable ones
https://sourceware.org/bugzilla/show_bug.cgi?id=24251#c10
error: Couldn't find DIE referenced by DW_AT_abstract_origin
(maybe DWZ error or maybe compiler error: Unknown DWARF DW_OP_255)
Thanks, Jan
On Thu, Sep 24, 2020 at 08:34:48PM +0200, Jan Kratochvil wrote:
The tool is not easily maintained,
I did not mention it there but it is true there are some longterm unfixed bugs in DWZ so that it gives up on optimization of many builds: error: Allocatable section after non-allocatable ones https://sourceware.org/bugzilla/show_bug.cgi?id=24251#c10
That isn't a longterm unfixed bug. It is a comment by you from a couple of days ago that you could still reproduce an already fixed bug. So we reopened the bug and asked you for a reproducer. Which you didn't provide. When we then tried to replicate the issue ourselves we found that dwz correctly seems to detect an issue with the golang toolchain.
error: Couldn't find DIE referenced by DW_AT_abstract_origin (maybe DWZ error or maybe compiler error: Unknown DWARF DW_OP_255)
We cannot fix bugs if you don't report them. Please file bugs for the above issue (including how to reproduce them) and we'll take a look.
Cheers,
Mark
On Fri, 25 Sep 2020 11:36:47 +0200, Mark Wielaard wrote:
On Thu, Sep 24, 2020 at 08:34:48PM +0200, Jan Kratochvil wrote:
error: Allocatable section after non-allocatable ones https://sourceware.org/bugzilla/show_bug.cgi?id=24251#c10
That isn't a longterm unfixed bug.
It is a long-term incompatible behavior, one can argue whether it is a golang bug or DWZ bug, opinions differ.
error: Couldn't find DIE referenced by DW_AT_abstract_origin (maybe DWZ error or maybe compiler error: Unknown DWARF DW_OP_255)
We cannot fix bugs if you don't report them. Please file bugs for the above issue (including how to reproduce them) and we'll take a look.
Why should I report them all? I am trying to fix Fedora by removing DWZ.
If you want to improve DWZ you have had access to the Fedora build.logs for all the years DWZ has been failing there. You did nothing with it. Moreover, in recent months I have even provided you in-company access to all the build logs on a single machine so that you can check it even more easily. Again you did nothing with it.
Jan
* Jan Kratochvil:
On Thu, 24 Sep 2020 19:16:32 +0200, Neal Gompa wrote:
Then that certainly means that Ubuntu uses this too, since they reuse the dbgsym subpackage generation for the ddeb system they have now.
I am not much familiar with Debian/Ubuntu but I cannot find any use of DWZ there: https://packages.ubuntu.com/groovy/amd64/bluez-dbg/download llvm-dwarfdump -color=0 bluez-dbg_5.55-0ubuntu1_amd64/data/usr/lib/debug/.build-id/*/*.debug|grep DW_TAG_partial_unit
This debuginfo package has been built 2020-09-15.
This is not a -dbgsym package, so it probably has been created by a different procedure. I do not know how Ubuntu distributes their -dbgsym packages. An example from Debian with .dwz paths is here:
Thanks, Florian
On Fri, Sep 25, 2020 at 3:08 AM Florian Weimer fweimer@redhat.com wrote:
- Jan Kratochvil:
On Thu, 24 Sep 2020 19:16:32 +0200, Neal Gompa wrote:
Then that certainly means that Ubuntu uses this too, since they reuse the dbgsym subpackage generation for the ddeb system they have now.
I am not much familiar with Debian/Ubuntu but I cannot find any use of DWZ there: https://packages.ubuntu.com/groovy/amd64/bluez-dbg/download llvm-dwarfdump -color=0 bluez-dbg_5.55-0ubuntu1_amd64/data/usr/lib/debug/.build-id/*/*.debug|grep DW_TAG_partial_unit
This debuginfo package has been built 2020-09-15.
This is not a -dbgsym package, so it probably has been created by a different procedure. I do not know how Ubuntu distributes their -dbgsym packages. An example from Debian with .dwz paths is here:
Ubuntu *definitely* has it. Checking "alsa-utils" from Ubuntu 20.04 shows dwz data.
Cf. http://ddebs.ubuntu.com/pool/main/a/alsa-utils/alsa-utils-dbgsym_1.2.2-1ubun...
On Fri, 25 Sep 2020 11:01:53 +0200, Neal Gompa wrote:
On Fri, Sep 25, 2020 at 3:08 AM Florian Weimer fweimer@redhat.com wrote:
This is not a -dbgsym package, so it probably has been created by a different procedure. I do not know how Ubuntu distributes their -dbgsym packages. An example from Debian with .dwz paths is here:
Ubuntu *definitely* has it. Checking "alsa-utils" from Ubuntu 20.04 shows dwz data.
Cf. http://ddebs.ubuntu.com/pool/main/a/alsa-utils/alsa-utils-dbgsym_1.2.2-1ubun...
Thanks for the info, I have already updated the Change text now.
Jan
On Thu, 24 Sep 2020 19:01:17 +0200, Neal Gompa wrote:
This is not true. Debian started producing -dbgsym packages and putting them in a separate repository years ago: https://wiki.debian.org/AutomaticDebugPackages
dwz is used by virtually all RPM based distributions now, including OpenMandriva (a clang-based distro). I know this because I implemented it. :)
Updated the Wiki, thanks. https://fedoraproject.org/wiki/Changes/DebugInfoStandardization
I do not know whether Debian has started using dwz by default because I haven't dug into how the -dbgsym package generation works in detail.
I do not see DWZ or -fdebug-types-section to be used for example in bash-dbgsym_5.1~beta1-1_amd64.deb .
Jan
Ben Cotton wrote on Thu, Sep 24, 2020:
** If the 3.3% size increase is a concern I can implement a different optimization ([https://whova.com/embedded/session/llvm_202010/1193947/ talk (2)]) as a GCC post-processing phase which would require no changes in any DWARF consumers.
That talk doesn't load for me, sorry if I ask something answered in there.
How does this relate to debuginfo compression such as passing -Wl,--compress-debug-sections=zlib ?
I haven't tested on very relevant programs but on a single C file (tested with vmtouch, compiled with f32's `gcc -g vmtouch.c` and `gcc -g vmtouch.c -Wl,--compress-debug-sections=zlib`) I see a fairly significant reduction in size: after strip --only-keep-debug it goes down from 32K to 23K, that's 28% down.
On a slightly bigger project (wlroots), with LDFLAGS=-Wl,--compress-debug-sections=zlib meson builddir and the same strip, I'm going from 1512k to 732k, that's over 50% reduction.
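For reference, the two percentages quoted above follow directly from the before/after sizes. A minimal sketch of the arithmetic (the helper name is hypothetical, the sizes are the ones given in this mail, and the wlroots figures are assumed to be in the same unit):

```python
def reduction_percent(before: float, after: float) -> float:
    """Relative size reduction, in percent of the original size."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # vmtouch: 32K -> 23K after strip --only-keep-debug
    print(round(reduction_percent(32, 23), 1))
    # wlroots: 1512 -> 732
    print(round(reduction_percent(1512, 732), 1))
```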
As far as I can see, gdb and lldb both support it just fine; and the linux kernel builds with it if CONFIG_DEBUG_INFO_COMPRESSED is set so it looks widespread enough.
If not related would it be worth using? Is support somehow still lacking?
Thanks,
On Thu, 24 Sep 2020 21:20:38 +0200, Dominique Martinet wrote:
Ben Cotton wrote on Thu, Sep 24, 2020:
** If the 3.3% size increase is a concern I can implement a different optimization ([https://whova.com/embedded/session/llvm_202010/1193947/ talk (2)]) as a GCC post-processing phase which would require no changes in any DWARF consumers.
That talk doesn't load for me, sorry if I ask something answered in there.
I have added a title there now but the URL loads for me even in lynx+wget. Copy-pasted it at the bottom of this mail. I do not know the talk but TL;DR existing DWARF contains some dead DIEs - unused/deduplicated functions and also -fdebug-types-section declarations/skeletons which can be removed or converted to direct DIE references respectively. That way one could reduce the size like DWZ does but without needing any new complicated support in DWARF consumers.
How does this relate to debuginfo compression such as passing -Wl,--compress-debug-sections=zlib ?
That is orthogonal - that is one can add it to DWZ or -fdebug-types-section the same way. It would be for another Fedora Change proposal but I do not think it matters for F-33 as it already implements: https://fedoraproject.org/wiki/Changes/BtrfsByDefault#Compression
I haven't yet checked whether that applies to /usr/lib/debug/ by default. btrfs is using zstd which has better performance than zlib. I was considering adding an ELF section compression extension for zstd but with btrfs transparent compression that does not look useful.
Some people have concerns about the performance of debuggers/tools loading such compressed *.debug files; my benchmarks show at most a 10% performance hit.
I have calculated for Fedora Rawhide -Wl,--compress-debug-sections=zlib saves 52.84% of on-disk *.debug size. But that does not apply to *-debuginfo.rpm as rpms are already compressed.
That 3.3% size reduction=advantage of DWZ against -fdebug-types-section is calculated for *-debuginfo.rpm (3.3% is for the whole distribution incl. binaries, for debug/ itself it is 6.35%). Also it is calculated for DWARF-4, F-34 will hopefully switch to DWARF-5 (which is smaller by 10-20%) but DWZ is not yet ported to DWARF-5 so it is impossible to compare -fdebug-types-section vs. DWZ size for DWARF-5.
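The relation between the 6.35% (of debug/ only) and the 3.3% (of the whole distribution) can be cross-checked against the sizes given in the Change text. A minimal sketch, assuming the 82GB debug/ and 75GB other-files figures are an acceptable stand-in for the sizes the percentages were measured on; the function name is hypothetical:

```python
DEBUG_GB = 82.0               # debug/ directory size from the Change text
OTHER_GB = 75.0               # all other files, from the Change text
DWZ_SAVING_OF_DEBUG = 0.0635  # 6.35% of debug/, from this mail


def share_of_distribution(debug_gb: float, other_gb: float, saving_of_debug: float):
    """Return (saved GB, saving as a fraction of the whole distribution)."""
    saved_gb = debug_gb * saving_of_debug
    return saved_gb, saved_gb / (debug_gb + other_gb)


if __name__ == "__main__":
    saved, share = share_of_distribution(DEBUG_GB, OTHER_GB, DWZ_SAVING_OF_DEBUG)
    print(f"saved about {saved:.1f} GB, {share * 100:.1f}% of the distribution")
```

Under those assumptions the result lands at roughly 5GB and 3.3%, which agrees with the "5GB of the 157GB distribution size" wording in the proposal.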
As far as I can see, gdb and lldb both support it just fine; and the linux kernel builds with it if CONFIG_DEBUG_INFO_COMPRESSED is set so it looks widespread enough.
Yes, zlib ELF section compression is well supported.
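To illustrate why consumers find this easy to support: in the ELF gABI a compressed section (flagged SHF_COMPRESSED) starts with a small compression header followed by an ordinary zlib stream. The sketch below assumes a little-endian ELF64 file and that the raw section bytes have already been extracted by the caller; the function names are hypothetical, and the compress helper exists only to build test input.

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1
# Elf64_Chdr: ch_type (u32), ch_reserved (u32), ch_size (u64), ch_addralign (u64)
CHDR64 = struct.Struct("<IIQQ")


def decompress_section(raw: bytes) -> bytes:
    """Decompress the contents of a SHF_COMPRESSED ELF64 section."""
    ch_type, _reserved, ch_size, _addralign = CHDR64.unpack_from(raw, 0)
    if ch_type != ELFCOMPRESS_ZLIB:
        raise ValueError(f"unsupported ch_type {ch_type}")
    data = zlib.decompress(raw[CHDR64.size:])
    if len(data) != ch_size:
        raise ValueError("decompressed size does not match ch_size")
    return data


def compress_section(data: bytes, addralign: int = 1) -> bytes:
    """Build compressed section contents (only used here to make test input)."""
    header = CHDR64.pack(ELFCOMPRESS_ZLIB, 0, len(data), addralign)
    return header + zlib.compress(data)


if __name__ == "__main__":
    original = b"DW_AT_name\0" * 100
    packed = compress_section(original)
    assert decompress_section(packed) == original
    print(len(original), "->", len(packed))
```

A consumer only needs this one extra step before parsing the section as usual, which is a different kind of change from teaching every DIE handler about a new unit layout.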
Jan
https://whova.com/embedded/session/llvm_202010/1193947/ 2) Fragmenting the DWARF to Enable Dead Debug Data Elimination Speaker: James Henderson
Standard DWARF defines a series of sections in the output, with one of each per object file. Each of these sections may have information about every function and variable in that unit. Linkers typically leave this information intact, referencing address 0 (or other tombstone value) when a function or variable's section is discarded, as the debug sections still contain used information. This approach has issues such as potential ambiguity and excessive space usage.
This talk will present a solution to these issues, leveraging existing ELF features, which enables linkers to discard dead pieces of DWARF without the linker requiring any special knowledge of its structure. It will also include performance figures to evaluate the approach.
Jan Kratochvil wrote on Thu, Sep 24, 2020:
That talk doesn't load for me, sorry if I ask something answered in there.
I have added a title there now but the URL loads for me even in lynx+wget.
Yeah sorry it finally loaded after 10+ minutes, that was weird.
Copy-pasted it at the bottom of this mail. I do not know the talk but TL;DR existing DWARF contains some dead DIEs - unused/deduplicated functions and also -fdebug-types-section declarations/skeletons which can be removed or converted to direct DIE references respectively. That way one could reduce the size like DWZ does but without needing any new complicated support in DWARF consumers.
Ok, avoiding duplicate data makes sense there is quite a lot in there.
That is orthogonal - that is one can add it to DWZ or -fdebug-types-section the same way. It would be for another Fedora Change proposal but I do not think it matters for F-33 as it already implements: https://fedoraproject.org/wiki/Changes/BtrfsByDefault#Compression
Good point. I did think of rpm size (double compression doesn't work well and rpms use better compression than zlib) but not filesystem compression. Everyone won't benefit from that right away but I guess it makes sense.
I haven't yet checked whether that applies to /usr/lib/debug/ by default. btrfs is using zstd which has better performance than zlib. I was considering adding an ELF section compression extension for zstd but with btrfs transparent compression that looks as not useful.
I don't have very much there but it does work well:
# compsize /usr/lib/debug/
Processed 720 files, 2232 regular extents (2239 refs), 1 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       32%       74M         229M         230M
none       100%      644K         644K         644K
zstd        32%       73M         229M         229M
That 3.3% size reduction=advantage of DWZ against -fdebug-types-section is calculated for *-debuginfo.rpm (3.3% is for the whole distribution incl. binaries, for debug/ itself it is 6.35%). Also it is calculated for DWARF-4, F-34 will hopefully switch to DWARF-5 (which is smaller by 10-20%) but DWZ is not yet ported to DWARF-5 so it is impossible to compare -fdebug-types-section vs. DWZ size for DWARF-5.
Ok. That definitely makes more sense to me, thanks for clarifying this.
On Fri, 25 Sep 2020 07:08:27 +0200, Dominique Martinet wrote:
Jan Kratochvil wrote on Thu, Sep 24, 2020:
Copy-pasted it at the bottom of this mail. I do not know the talk but TL;DR existing DWARF contains some dead DIEs - unused/deduplicated functions and also -fdebug-types-section declarations/skeletons which can be removed or converted to direct DIE references respectively. That way one could reduce the size like DWZ does but without needing any new complicated support in DWARF consumers.
Ok, avoiding duplicate data makes sense there is quite a lot in there.
Duplicate data is what DWZ and -fdebug-types-section is about.
This other optimization - only draft-implemented for clang so far - removes completely dead/unused/unusable information. That is debuginfo for functions which were originally compiled but later removed as either unused or duplicate with other existing functions. But their debuginfo cannot be easily removed so it is only disabled by setting its debug info address to zero. Some approx. calculation of possible size saving removing those debug info entries I wrote in: https://git.jankratochvil.net/?p=massrebuild.git;a=blob_plain;f=dwarfredunda... I have calculated the saving on Fedora Rawhide *-debuginfo.rpm as 5.96%. That is together with removing -fdebug-types-section skeletons. It is only approximate as it does not calculate savings on removed abbreviations, OTOH it does not calculate absolute DIE references as replacements. One can still expect the real saving would be slightly bigger (better).
I do not think it matters much (does it?) but if someone wants to advocate DWZ due to its size savings then this new optimization:
* it saves the same amount of data as DWZ
* it needs absolutely no new support in DWARF consumers
* it adds no overhead when *.debug files are downloaded separately; DWZ has an overhead of 6.35% of the *-debuginfo.rpm size due to the DWZ common files.
Sure one could apply both this dead-DIE removal AND DWZ together which would get some approx. 6%+7%=13% of *-debuginfo.rpm reduction.
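The 6%+7%=13% estimate adds the two savings directly. A minimal sketch of the alternative reading, where the second reduction applies only to what is left after the first, is below. Treating the two savings as independent is an assumption that the thread does not establish, so both numbers are only rough bounds.

```python
# Approximate savings quoted above, as fractions of *-debuginfo.rpm size.
dead_die_saving = 0.06  # dead-DIE removal
dwz_saving = 0.07       # DWZ

# Simple sum, as in the estimate above.
additive = dead_die_saving + dwz_saving

# Assumption: the savings are independent, so the second one applies
# only to the data remaining after the first.
compounded = 1.0 - (1.0 - dead_die_saving) * (1.0 - dwz_saving)

print(f"additive:   {additive:.1%}")
print(f"compounded: {compounded:.1%}")
```

Either way the combined reduction stays close to the approx. 13% mentioned above.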
I don't have very much there but it does work well:
# compsize /usr/lib/debug/
Processed 720 files, 2232 regular extents (2239 refs), 1 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       32%       74M         229M         230M
none       100%      644K         644K         644K
zstd        32%       73M         229M         229M
Nice, thanks for checking it.
Jan
That is orthogonal - that is one can add it to DWZ or -fdebug-types-section the same way. It would be for another Fedora Change proposal but I do not think it matters for F-33 as it already implements: https://fedoraproject.org/wiki/Changes/BtrfsByDefault#Compression
I added a note about 2.5 weeks ago.
NOTE: These optimizations will not be used by default in Fedora 33, but are available for opt-in adoption and evaluation.
Hi,
Replying since I am mentioned by name in this proposal and it seems to argue for removing a feature I am currently working on to make sure it works correctly with GCC11 if it switches to producing DWARF5 by default. The proposal seems based on some misunderstandings.
On Thu, Sep 24, 2020 at 11:59:44AM -0400, Ben Cotton wrote:
https://fedoraproject.org/wiki/Changes/DebugInfoStandardization
== Summary == Fedora 18 implemented [[Features/DwarfCompressor]]. As the format did not get widespread and the tool is not much maintained
As others pointed out dwz is widely used. It is used by almost every distro in some form and even freedesktop.org flatpaks use dwz for their debuginfo.
The upstream project is actively maintained. Even though there are just 3 committers (including me) the project is still seeing ~2.5 commits a week (about 130 in a year).
There exist several methods how to make the *-debuginfo.rpms at least a bit smaller. Fedora 18 started using DWZ tool (from [[Features/DwarfCompressor]]) while [https://gcc.gnu.org/pipermail/gcc-patches/2008-August/246281.html Google implemented] the same goal in a different way called -fdebug-types-section.
Note that these methods are not in conflict. Both started out as GNU extensions but have been standardized since.
so its support is missing in tools like [https://lldb.llvm.org/ LLDB],
But you have been maintaining an out of tree patch for several years to support partial units and supplemental files (both of which dwz produces and are now standard DWARF). It would be good if you upstreamed those patches.
[[llvm-dwarfdumphttps://llvm.org/docs/CommandGuide/llvm-dwarfdump.html%7Cllvm-dwarfdump]]
Note that normal dwarfdump as shipped with libdwarf-tools does support both partial units and supplemental files (you do need to provide the alt file with --file-tied=/path/to/alt-file which probably should be done automatically).
or binutils readelf.
binutils readelf is used as the main tool in the dwz testsuite. It certainly supports both partial units and supplemental files. Also note the -wK (=follow-links) option [it isn't on by default, maybe it should be] that resolves alt links.
I don't know of any other tool which doesn't support either partial units or supplemental files, since both are (now) standard DWARF.
- DWZ disadvantage: DWZ requires 8x times more complicated (LoC count)
support in consumers than -fdebug-types-section.
I have worked on support for both in various consumers and didn't find one method more difficult to support than the other. Maybe debug-types was actually more difficult. Because the representation changed from separate section in DWARF4, to intermingled with other .debug_info units in DWARF5. Making supporting objects that contained mixed versions a bit of a pain.
- DWZ disadvantage: DWZ cannot update LLVM .debug_names index which
can be generated only by clang (it cannot be regenerated later for DWZ-compressed file)
dwz does support .gdb_index, the pre-standardized DWARF5 .debug_names variant. gdb hasn't switched to supporting .debug_names yet and .gdb_index does work with DWARF5. I don't think the gdb maintainers would mind you adding .debug_names support if you believe it to be better than .gdb_index.
- DWZ disadvantage: DWZ DWARF-5 support is a work-in-progress. DWZ has
been blocking DWARF-5 for Fedora for 3.5 years and only after I have now proposed to drop DWZ Mark Wielaard has started porting DWZ to DWARF-5.
I have worked on DWARF5 support for various projects for the last couple of years. Since it looks like GCC11 might switch to DWARF5 by default I started adding DWARF5 support to dwz. I started that earlier this month after we had a discussion on DWARF5 at the virtual GNU Tools Cauldron; I didn't know you were proposing to drop DWZ in Fedora. You can follow the progress on the dwz mailing list. https://sourceware.org/pipermail/dwz/current/
This proposed DWARF format was originally submitted already for Fedora 18 as [[Features/DebugTypesSections]].
Note that it isn't a different format, just an additional way to represent some parts of DWARF. I have also proposed gcc would emit debug-types by default and even discussed it two times at the GNU Tools Cauldron. But I couldn't get buy-in from the GCC hackers that it is a good idea. And I do think they have a point. It is a not very flexible design with a somewhat high overhead (somewhat reduced in DWARF5 by not making it part of a separate section). If you think it really is a good idea to use them more broadly and by default then you should probably start by getting consensus from the upstream gcc developers that it should be enabled by default. I don't think it is in conflict with also using dwz. But it will probably require some extra work.
Cheers,
Mark
On Fri, 25 Sep 2020 01:35:43 +0200, Mark Wielaard wrote:
Replying since I am mentioned by name in this proposal and it seems to argue for removing a feature I am currently working on to make sure it works correctly with GCC11 if it switches to producing DWARF5 by default.
The problem is you are not working on DWARF-5 features produced by LLVM so even your planned DWZ is still not usable for LLVM-built binaries.
In the case DWZ is used for GCC-built binaries what to do with LLVM-built binaries?
(1) Build them with neither DWZ nor -fdebug-types-section. They will get up to 2x as big.
(2) Build them with -fdebug-types-section. Then the distro tools need to be -fdebug-types-section compatible anyway (which they already are modulo testing).
(3) You were proposing to build them as DWARF-4. That regresses functionality of Fedora compared to other OSes. It is also unfair to restrict LLVM due to a deficiency of DWZ.
On Thu, Sep 24, 2020 at 11:59:44AM -0400, Ben Cotton wrote: As others pointed out dwz is widely used. It is used by almost every distro in some form and even freedesktop.org flatpaks use dwz for their debuginfo.
As Debian really uses DWZ (as pointed out by Florian Weimer) I have updated the Change text now.
The upstream project is actively maintained.
Fedora has been waiting for DWARF-5 for 3.5 years due to DWZ. So I do not find it as maintained. But I have removed the sentence if there is a disagreement on it.
Even though there are just 3 committers (including me) the project is still seeing ~2.5 commits a week (about 130 in a year).
Still existing bugs are not fixed for years as nobody looks at build.log's.
There exist several methods how to make the *-debuginfo.rpms at least a bit smaller. Fedora 18 started using DWZ tool (from [[Features/DwarfCompressor]]) while [https://gcc.gnu.org/pipermail/gcc-patches/2008-August/246281.html Google implemented] the same goal in a different way called -fdebug-types-section.
Note that these methods are not in conflict. Both started out as GNU extensions but have been standardized since.
They are in conflict as DWZ still does not support -fdebug-types-section (after 12 years of -fdebug-types-section and 8 years of DWZ).
In binaries built with -fdebug-types-section DWZ cannot use -m DWZ common file and in such case DWZ produces 1.6% bigger Fedora distro *-debuginfo.rpm than simple -fdebug-types-section. That means running DWZ after -fdebug-types-section is pointless.
so its support is missing in tools like [https://lldb.llvm.org/ LLDB],
But you have been maintaining an out of tree patch for several years to support partial units and supplemental files (both of which dwz produces and are now standard DWARF). It would be good if you upstreamed those patches.
That is the reason why I submitted this Change. From my point of view the DWZ technology makes no sense and so I find it a better fix to drop DWZ. I am interested in the opinion of Fedora / FESCo.
or binutils readelf.
binutils readelf is used as the main tool in the dwz testsuite. It certainly supports both partial units and supplemental files. Also note the -wK (=follow-links) option [it isn't on by default, maybe it should be] that resolves alt links.
I was not aware of it (it is from 2017-11-15). I have updated the Change text. Yes, it definitely should be on by default.
I don't know of any other tool which doesn't support either partial units or supplemental files, since both are (now) standard DWARF.
GCC also does not support most of DWARF-5 after it is standardized for 3.5 years (only these days you started implementing better DWARF-5 support).
And the DWARF-5 standard even still does not specify .debug_names for DWZ, that has been apparently forgotten.
- DWZ disadvantage: DWZ requires 8x times more complicated (LoC count)
support in consumers than -fdebug-types-section.
I have worked on support for both in various consumers and didn't find one method more difficult to support than the other. Maybe debug-types was actually more difficult. Because the representation changed from separate section in DWARF4, to intermingled with other .debug_info units in DWARF5. Making supporting objects that contained mixed versions a bit of a pain.
It is easy for primitive tools like GDB which do DWARF->IR (Internal Representation) conversion for whole CUs (Compilation Units = per .o file) and not per DIE (one element like a variable/type/function). That was fine in the 80s but not now with big C++ template libraries. That means on real C++ code GDB consumes about 20GB of memory and 5 minutes of runtime to print a variable. That does not happen much for Fedora packages as most of the packages are not in C++ and the C++ ones are not heavily templatized (there are exceptions). LLDB builds many indexes of the DIEs to be efficient and with DWZ all of those need to be modified to also track the DWZ parent importer units indicating the dictionaries where the IRs should be placed.
- DWZ disadvantage: DWZ cannot update LLVM .debug_names index which
can be generated only by clang (it cannot be regenerated later for DWZ-compressed file)
dwz does support .gdb_index, the pre-standardized DWARF5 .debug_names variant. gdb hasn't switched to supporting .debug_names yet and .gdb_index does work with DWARF5. I don't think the gdb maintainers would mind you adding .debug_names support if you believe it to be better than .gdb_index.
GDB does not need .debug_names because it does the simple but ineffective full CU IR conversion. .gdb_index cannot be used by effective LLDB as .gdb_index does not contain the essential DIE offsets at all.
It is a not very flexible design with a somewhat high overhead (somewhat reduced in DWARF5 by not making it part of a separate section).
Whether the section is in ".debug_info" or ".debug_types" is a few lines of code. But tracking importing CUs is thousands of lines of code.
I understand DWZ is not a problem for trivial utilities, that is not the case of LLDB.
If you think it really is a good idea to use them more broadly and by default then you should probably start by getting consensus from the upstream gcc developers that it should be enabled by default.
I could discuss making -fdebug-types-section the default for clang as I have switched to clang in 2013 and I have never looked back.
I don't think it is in conflict with also using dwz. But it will probably require some extra work.
The whole DWZ question is whether to use DWZ common files. DWZ common files are in /usr/lib/debug/.dwz/ , there is always one file per *-debuginfo.rpm and they do save some *.debug size by placing big class definitions to a single place for the whole rpm. This way DWZ has 6.35% (for DWARF-4) smaller *-debuginfo.rpm size (although the 6.35% has stddev +/-11%, it depends A LOT on which package you DWZ).
OTOH when downloading *.debug files separately (such as by debuginfod https://sourceware.org/elfutils/Debuginfod.html ) the DWZ common files make the total download sizes bigger by 6% in average compared to -fdebug-types-section.
So the DWZ common files have some advantages and disadvantages. And given I believe nobody cares about few percents of *-debuginfo.rpm size I do not find it worth the trouble.
In the case one no longer uses DWZ common files (dwz option -m) then DWZ produces *-debuginfo.rpm 1.6% bigger than more simple -fdebug-types-section. So without DWZ common files I find obvious DWZ would really make no sense.
Thanks for reply, Jan
Hi Jan,
On Fri, 2020-09-25 at 11:43 +0200, Jan Kratochvil wrote:
On Fri, 25 Sep 2020 01:35:43 +0200, Mark Wielaard wrote:
Replying since I am mentioned by name in this proposal and it seems to argue for removing a feature I am currently working on to make sure it works correctly with GCC11 if it switches to producing DWARF5 by default. The proposal seems based on some misunderstandings.
The problem is you are not working on DWARF-5 features produced by LLVM so even your planned DWZ is still not usable for LLVM-built binaries. [...] It is also unfair to restrict LLVM due to a deficiency of DWZ. [...] Fedora has been waiting for DWARF-5 for 3.5 years due to DWZ. [...] From my point of view the DWZ technology makes no sense and so I find a better fix to drop DWZ. [...] GCC also does not support most of DWARF-5 after it is standardized for 3.5 years (only these days you started implementing better DWARF-5 support). [...] It is easy for primitive tools like GDB [...] I understand DWZ is not a problem for trivial utilities, that is not the case of LLDB. [...] I have switched to clang in 2013 and I have never looked back.
You aren't really making sure to win people over if you tell others they aren't working fast enough, they aren't prioritizing your bugs, disparaging others' work because they have implemented support for newer DWARF constructs, just not in the way you would have done it, and not being willing to help out with fixing things because you aren't personally interested in the Fedora default GNU toolchain.
DWZ works totally fine with llvm produced binaries, there is even a whole distro built with clang that uses DWZ (it just doesn't use DWARF5 by default). Your comments about GCC are also wrong. GCC does support and produce almost any construct DWARF-5 specifies (after all, they used to be GNU extensions before), it just doesn't use certain forms or index tables unless you are producing split-dwarf (which like debug-types also isn't enabled by default).
If you want to make -fdebug-types-section the default you really should work with the upstream GCC developers to figure out why they don't want that. Trying to override upstream defaults in Fedora without understanding why upstream decided on the current defaults isn't a good idea IMHO.
I totally get that it is frustrating if you worked for a long time on a new feature to support some DWARF constructs for lldb and you aren't able to get the patches in shape to be accepted upstream. But I am happy people now know about your patches and seem to find them useful.
You say it is difficult to support DWARF partial units as generated by dwz in lldb, but dwz doesn't really do anything non-standard (and GCC with LTO also generates partial units). If your complaint is that partial unit DIEs are missing some for your use case essential attributes, then we can look at adding them (note that Tom de Vries has experimented with --devel-gen-cu and --devel-uni-lang in dwz git, which might provide you with something you can use, see the dwz commit messages for some more background).
I do find your statistics per package useful because they show dwz is in general effective by producing at least 20% (more) on-disk size reduction, even though there are some packages where dwz doesn't seem as effective as it could be. We definitely should investigate those issues. When I looked at the build.logs in the past it did show some issues with either the DWARF producers, rpm debugedit or dwz. Please do report bugs if they aren't known yet to the upstream projects.
But I don't really understand why you then focus on the zstd compressed rpms (even if even those favor dwz). That simply shows that zstd compression seems to work pretty well. But it doesn't show the on-disk (and in-memory) size reductions. Or why for the debuginfod use case you seem to do the opposite, not take into account that the http debuginfod server will compress the files before sending over the network. I don't think either of those later statistics are really relevant with respect to your proposal.
Finally I am interested in your proposal to implement a different way to reduce the size of DIE trees by eliminating "unused" DIEs. It is hard to predict what effect that would have without seeing an implementation (in theory GCC with LTO would not actually generate debuginfo for unused functions). But I think that can be done separate from your proposal and combined with other size reduction techniques.
Cheers,
Mark
On Mon, Sep 28, 2020 at 12:31:59PM +0200, Mark Wielaard wrote:
Finally I am interested in your proposal to implement a different way to reduce the size of DIE trees by eliminating "unused" DIEs. It is hard to predict what effect that would have without seeing an implementation (in theory GCC with LTO would not actually generate debuginfo for unused functions). But I think that can be done separate from your proposal and combined with other size reduction techniques.
And note that GCC already does implement -feliminate-unused-debug-{symbols,types} which are enabled by default and are (at least in my eyes) sometimes too aggressive, so by eliminating even further DIEs the debug experience might be even worse.
Jakub
On Mon, 28 Sep 2020 12:42:33 +0200, Jakub Jelinek wrote:
On Mon, Sep 28, 2020 at 12:31:59PM +0200, Mark Wielaard wrote:
Finally I am interested in your proposal to implement a different way to reduce the size of DIE trees by eliminating "unused" DIEs. It is hard to predict what effect that would have without seeing an implementation (in theory GCC with LTO would not actually generate debuginfo for unused functions). But I think that can be done separate from your proposal and combined with other size reduction techniques.
And note that GCC already does implement -feliminate-unused-debug-{symbols,types} which are enabled by default and are (at least in my eyes) sometimes too aggressive, so by eliminating even further DIEs the debug experience might be even worse.
git clone git://git.jankratochvil.net/massrebuild
./dwarfredundant lldb-debuginfo-11.0.0-0.2.rc3.fc34.x86_64/usr/lib/debug/usr/lib64/liblldb.so.11.0.0-11.0.0-0.2.rc3.fc34.x86_64.debug
For example:
saved=27:
0x0193b058: DW_TAG_subprogram [47] *
              DW_AT_abstract_origin [DW_FORM_ref_addr] (0x000000000515d936 "_ZN12lldb_private17StoppointLocationD2Ev")
              DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
              DW_AT_high_pc [DW_FORM_udata] (1)
              DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_call_frame_cfa)
              DW_AT_GNU_all_call_sites [DW_FORM_flag_present] (true)
              DW_AT_sibling [DW_FORM_ref_udata] (cu + 0x1da4f => {0x0193b073})
0x0193b06b:   DW_TAG_formal_parameter [55]
                DW_AT_abstract_origin [DW_FORM_ref_addr] (0x000000000515d941)
                DW_AT_location [DW_FORM_exprloc] (DW_OP_reg5 RDI)
0x0193b072:   NULL
This DIE describes only a concrete function instance at address 0x0. No function can exist on address 0x0 on x86_64, that is a discarded/deduplicated function: [Dwarf-Discuss] DWARF for linker GC'd code http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2020-July/004...
I do not see what any DWARF consumer could find out from such a DIE.
And there are many such DIEs, something a bit less than 28% of what DWZ saves (28% is incl. removal of DW_UT_type declarations).
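The heuristic described above (a DW_TAG_subprogram whose DW_AT_low_pc is 0 is a discarded or deduplicated function) can be sketched as a small text scan. This is not the dwarfredundant tool itself. The dump layout is assumed to look like the listing above, the sample text is shortened, and the second DIE with a nonzero address is made up for contrast.

```python
import re

# Shortened sample in the style of the dump above; the second subprogram
# (nonzero DW_AT_low_pc) is hypothetical and only there for contrast.
SAMPLE = """\
0x0193b058: DW_TAG_subprogram [47] *
              DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
              DW_AT_high_pc [DW_FORM_udata] (1)
0x0193b06b:   DW_TAG_formal_parameter [55]
                DW_AT_location [DW_FORM_exprloc] (DW_OP_reg5 RDI)
0x0193b072:   NULL
0x0193b073: DW_TAG_subprogram [47] *
              DW_AT_low_pc [DW_FORM_addr] (0x0000000000401000)
              DW_AT_high_pc [DW_FORM_udata] (16)
"""

# A DIE header line starts with its offset; attribute lines are indented.
DIE_HEADER = re.compile(r"^(0x[0-9a-f]+):\s+(DW_TAG_\w+|NULL)")
LOW_PC = re.compile(r"DW_AT_low_pc\s+\[DW_FORM_addr\]\s+\((0x[0-9a-f]+)\)")


def find_dead_subprograms(dump_text):
    """Return offsets of DW_TAG_subprogram DIEs whose DW_AT_low_pc is 0."""
    dead = []
    current_offset = None
    current_tag = None
    for line in dump_text.splitlines():
        header = DIE_HEADER.match(line)
        if header:
            # Remember which DIE the following attribute lines belong to.
            current_offset, current_tag = header.group(1), header.group(2)
            continue
        low_pc = LOW_PC.search(line)
        if low_pc and current_tag == "DW_TAG_subprogram":
            if int(low_pc.group(1), 16) == 0:
                dead.append(current_offset)
    return dead


dead_offsets = find_dead_subprograms(SAMPLE)
print(dead_offsets)
```

The check relies on the claim above that no function can live at address 0 on x86_64; on targets or file types where address 0 is valid it would give false positives.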
Jan
On Mon, Sep 28, 2020 at 05:15:16PM +0200, Jan Kratochvil wrote:
On Mon, 28 Sep 2020 12:42:33 +0200, Jakub Jelinek wrote:
On Mon, Sep 28, 2020 at 12:31:59PM +0200, Mark Wielaard wrote:
Finally I am interested in your proposal to implement a different way to reduce the size of DIE trees by eliminating "unused" DIEs. It is hard to predict what effect that would have without seeing an implementation (in theory GCC with LTO would not actually generate debuginfo for unused functions). But I think that can be done separate from your proposal and combined with other size reduction techniques.
And note that GCC already does implement -feliminate-unused-debug-{symbols,types} which are enabled by default and are (at least in my eyes) sometimes too aggressive, so by eliminating even further DIEs the debug experience might be even worse.
git clone git://git.jankratochvil.net/massrebuild
./dwarfredundant lldb-debuginfo-11.0.0-0.2.rc3.fc34.x86_64/usr/lib/debug/usr/lib64/liblldb.so.11.0.0-11.0.0-0.2.rc3.fc34.x86_64.debug
So, was this compiled by GCC or clang? If GCC, I don't see how one could end up with low_pc of 0 unless it is a comdat function where there is a DIE from the TU that actually was selected by the linker, or some other copy that wasn't selected. A way out of this could be either to use comdat .debug_info etc. sections (but that would result in quite large increase of *.o file sizes), or let the linker or a tool like DWZ discard or simplify such DIEs. I don't see how could you see at compile time that the linker will not choose the particular copy.
Jakub
On Mon, 28 Sep 2020 17:35:26 +0200, Jakub Jelinek wrote:
So, was this compiled by GCC or clang?
Fedora Koji package: lldb-debuginfo-11.0.0-0.2.rc3.fc34.x86_64
GNU GIMPLE 10.2.1 20200916 (Red Hat 10.2.1-4) -m64 -mtune=generic -march=x86-64 -g -g -g -O2 -O2 -O2 -O2 -fno-openmp -fno-openacc -fcf-protection=none -ffat-lto-objects -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables -fstack-clash-protection -fPIC -ffunction-sections -fdata-sections -fltrans -fplugin=annobin
A way out of this could be either to use comdat .debug_info etc. sections (but that would result in quite large increase of *.o file sizes), or let the linker or a tool like DWZ discard or simplify such DIEs. I don't see how could you see at compile time that the linker will not choose the particular copy.
Another option is to use clang which should have such optimization implemented soon: https://whova.com/embedded/session/llvm_202010/1193947/
Jan
On Mon, Sep 28, 2020 at 05:46:08PM +0200, Jan Kratochvil wrote:
A way out of this could be either to use comdat .debug_info etc. sections (but that would result in quite large increase of *.o file sizes), or let the linker or a tool like DWZ discard or simplify such DIEs. I don't see how could you see at compile time that the linker will not choose the particular copy.
Another option is to use clang which should have such optimization implemented soon: https://whova.com/embedded/session/llvm_202010/1193947/
If you do it on the compiler side, you'll get a lot of those pesky partial units you so hate on the lldb side.
Jakub
On Mon, 28 Sep 2020 17:58:58 +0200, Jakub Jelinek wrote:
On Mon, Sep 28, 2020 at 05:46:08PM +0200, Jan Kratochvil wrote:
If you do it on the compiler side, you'll get a lot of those pesky partial units you so hate on the lldb side.
There are many ways how clang could implement it. I have no idea how is the draft implemented but unless you have more information I do not think they would use DW_TAG_partial_unit.
Jan
On Mon, Sep 28, 2020 at 08:29:21PM +0200, Jan Kratochvil wrote:
On Mon, 28 Sep 2020 17:58:58 +0200, Jakub Jelinek wrote:
On Mon, Sep 28, 2020 at 05:46:08PM +0200, Jan Kratochvil wrote:
If you do it on the compiler side, you'll get a lot of those pesky partial units you so hate on the lldb side.
There are many ways how clang could implement it. I have no idea how is the draft implemented but unless you have more information I do not think they would use DW_TAG_partial_unit.
There aren't that many ways actually: if no special linker assistance is required, then the different DIEs need to go into separate units, and compile units wouldn't be really appropriate, as they should match the original source TUs. On the other hand, doing this in DWZ should be pretty easy.
Jakub
On Mon, 28 Sep 2020 17:58:58 +0200, Jakub Jelinek wrote:
On Mon, Sep 28, 2020 at 05:46:08PM +0200, Jan Kratochvil wrote:
On Mon, 28 Sep 2020 17:35:26 +0200, Jakub Jelinek wrote:
A way out of this could be either to use comdat .debug_info etc. sections (but that would result in quite large increase of *.o file sizes), or let the linker or a tool like DWZ discard or simplify such DIEs. I don't see how could you see at compile time that the linker will not choose the particular copy.
Another option is to use clang which should have such optimization implemented soon: https://whova.com/embedded/session/llvm_202010/1193947/
If you do it on the compiler side, you'll get a lot of those pesky partial units you so hate on the lldb side.
No. The LLVM patches from Sony are using COMDAT groups you mentioned above: https://youtu.be/oSCbzLC46Vg?t=312
Jan
On Mon, Sep 28, 2020 at 05:46:08PM +0200, Jan Kratochvil wrote:
On Mon, 28 Sep 2020 17:35:26 +0200, Jakub Jelinek wrote:
So, was this compiled by GCC or clang?
Fedora Koji package: lldb-debuginfo-11.0.0-0.2.rc3.fc34.x86_64
GNU GIMPLE 10.2.1 20200916 (Red Hat 10.2.1-4) -m64 -mtune=generic -march=x86-64 -g -g -g -O2 -O2 -O2 -O2 -fno-openmp -fno-openacc -fcf-protection=none -ffat-lto-objects -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables -fstack-clash-protection -fPIC -ffunction-sections -fdata-sections -fltrans -fplugin=annobin
Note that you are using -ffunction-sections together with -flto. With -flto you don't need -ffunction-sections.
-ffunction-sections might cause functions to be dropped by the linker without updating the DWARF DIEs, causing things like a zero DW_AT_low_pc.
Just using -flto should not cause such issues.
Cheers,
Mark
On Tue, 29 Sep 2020 22:31:28 +0200, Mark Wielaard wrote:
Note that you are using -ffunction-sections together with -flto. With -flto you don't need -ffunction-sections.
-ffunction-sections might cause functions to be dropped by the linker without updating the DWARF DIEs, causing things like a zero DW_AT_low_pc.
Just using -flto should not cause such issues.
Thanks for this investigation. You are right in gcc -flto binaries I cannot find these dead DIEs.
Found C++ with LTO in libabigail, libreoffice, powertop. Surprisingly gcc has LTO turned off.
Jan
On Mon, 28 Sep 2020 12:31:59 +0200, Mark Wielaard wrote:
If you want to make -fdebug-types-section the default you really should work with the upstream GCC developers to figure out why they don't want that.
I haven't seen that, according to Richard Biener from GCC -fdebug-types-section is a normally supported GCC feature: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88878#c6 I am aware only of Jakub Jelinek who is against -fdebug-types-section. I should also state he is the author of DWZ.
If GCC is unable to support such a trivial feature as -fdebug-types-section then Fedora should really already switch to the standard compiler. It will come sooner or later anyway. This deviation from standard tools just causes continuous troubles such as: [fesco] Issue #2020: Firefox is switching from gcc to clang/llvm https://pagure.io/fesco/issue/2020#comment-671672
Trying to override upstream defaults in Fedora without understanding why upstream decided on the current defaults isn't a good idea IMHO.
You know very well Fedora already overrides upstream GCC defaults a lot: -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection
And DWZ is even a project unrelated to GCC so calling for upstream defaults would really call for dropping DWZ.
I totally get that it is frustrating if you worked for a long time on a new feature to support some DWARF constructs for lldb
It is in no way a feature as it does not bring any user-visible improvement. It is only compatibility with a marginally (0.1%) used file format with disputable benefits.
and you aren't able to get the patches in shape to be accepted upstream.
That is a repeated lie I have never even suggested. LLDB is fine accepting my DWZ patches. I understand you are used to difficulty of upstreaming patches in GNU projects but that is not the case of LLVM. In fact LLDB is a completely different world in accepting patches than GDB where it was taking me even 10+ years to get some patches accepted. For GDB you need to learn first how to do the ancient ineffective and bug-prone coding practices, force yourself to really execute them to become a global maintainer and only then you manage to get patches checked in.
You are repeatedly trying to make it look as the problem is upstreaming DWZ support. That is not any problem. The problem is the DWZ itself as it just isn't worth the effort of supporting it in all the consumers.
As I am repeating again and again I just find DWZ too complicated for both production and consumption for so little gain (size reduction) it achieves. So before I complicate the LLDB codebase by the DWZ support and make it a catch-up game for Red Hat, Fedora and others forever (as Apple+Google is never going to use DWZ and they know why) I am trying by this Change to save a lot of time for everyone.
The years of engineering time I have already spent on DWZ and the years of engineering time I will have to spend on its future maintenance and reworks (for clang DWARF) could be better spent on improving the debugging experience. We are no longer living in the 80s when a few saved bytes of size were critical for whether the debugger would be able to run or not. Apparently GNU developers still haven't realized that change.
But I am happy people now know about your patches and seem to find them useful.
"useful" means with my patches they can workaround the Fedora problem of encumbering its debuginfo by DWZ.
And this question is not about the existing patches. That waste of engineering time has been already done. It is about the future waste of time maintaining compatibility with the DWZ format almost nobody (0.1%) cares about.
You say it is difficult to support DWARF partial units as generated by dwz in lldb, but dwz doesn't really do anything non-standard (and GCC with LTO also generates partial units).
"standard" means that the DWZ specification has been accepted into DWARF-5 standard. IMO that was a mistake. It may have happened because the DWZ author is a member of the DWARF standard committee. Apparently nobody has so far implemented any reasonable/effective consumer for DWARF-5 DWZ otherwise they would put some restrictions into the standard.
To make DWZ better consumable it needs to have the partial units separately parseable. That way they can be shared at IR level and not just at DWARF level. That means:
- DW_TAG_partial_unit should have DW_AT_language.
- DW_TAG_partial_unit must contain only types (struct/class). Currently they contain for example also static constant variables but when you parse such independent DW_TAG_partial_unit into which dictionary you will register such variable? That makes no sense.
Currently DWZ has benefits only with the DWZ common file. Without DWZ common files DWZ produces files 1.6% bigger than -fdebug-types-section. Therefore if the DWZ common files saving of 3.3% per Fedora distribution size is worth it then the existing DWZ tool should be dropped anyway as one can use normal -fdebug-types-section and one can just write a simple tool moving DW_UT_type units to the DWZ common file and converting DW_AT_signature declarations to DW_FORM_ref_sup4 and we are done.
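To illustrate the idea of such a "simple tool", here is a toy model only. It does not read or write real DWARF: each *.debug file is modelled as a hypothetical dictionary mapping a type signature to the size of its DW_UT_type unit, and the cost of the DW_FORM_ref_sup4 references left behind is ignored. The function and file names are made up for this sketch.

```python
from collections import Counter


def split_common(files):
    """Move type units that occur in more than one file into a common file.

    files: {file_name: {type_signature: unit_size_in_bytes}}
    Returns (common, slimmed) where common holds each shared unit once and
    slimmed holds what remains in every individual file.
    """
    counts = Counter(sig for units in files.values() for sig in units)
    common = {}
    slimmed = {}
    for name, units in files.items():
        kept = {}
        for sig, size in units.items():
            if counts[sig] > 1:
                common[sig] = size  # stored only once in the common file
            else:
                kept[sig] = size    # unique to this file, stays where it is
        slimmed[name] = kept
    return common, slimmed


def total_size(files):
    return sum(size for units in files.values() for size in units.values())


# Two hypothetical debug files sharing one type unit (signature 0x1111).
files = {
    "a.debug": {0x1111: 400, 0x2222: 120},
    "b.debug": {0x1111: 400, 0x3333: 80},
}
common, slimmed = split_common(files)
before = total_size(files)
after = sum(common.values()) + total_size(slimmed)
print(before, after)
```

A real tool would also have to rewrite the DW_AT_signature references and account for their size, so the saving in practice would be smaller than this model suggests.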
DWARF standard sometimes makes mistakes, for example .debug_pubnames and .debug_pubtypes were never really usable and DWARF-5 removed them. It may be perfectly possible the DWZ extension of DWARF-5 will be removed in DWARF-6 or future DWARF standards as it turns out it is not worth the format complexity. That some text has been accepted into the DWARF standard does not mean much.
It is all about engineering effort. I agree if the support of DWZ was trivial (or there were unlimited engineering resources) then DWZ is really better than -fdebug-types-section (except it would need a DWZ tool with fewer bugs and better coding practices). But reality shows the DWZ support is not trivial, engineering resources for Fedora are very limited, and so we have to decide whether the serious effort to support DWZ is better spent on DWZ or on making the debuggers better/really usable.
Given how much you propose DWZ apparently the 3.3% Fedora distribution size increase (if DWZ is dropped) is considered as too much. Obviously if the size increase was just let's say 0.1% it would be acceptable to drop DWZ - do you agree? So where is your size limit where the years-effort of supporting DWZ is worth it? I would say my limit would be maybe 20%, far above the 3.3%.
For example during Fedora Package Review Process do some packages get rejected because they would make the distribution too large? Not worth of including such new package? I am not aware of such decision and it even sounds funny to me. But that is what you choose here by enforcing DWZ no matter how little savings it has.
DWZ is a nice engineering idea and a nice engineering challenge. But it has no meaning for end-users. This is why I have switched from GDB to LLVM as Google&Apple are solving real end-user problems and not artificial engineering challenges just to have some nice coding time. But in the end I end up stuck on another GNU non-sense (this time DWZ) needing to be supported by LLVM in Fedora only for compatibility reasons.
If your complaint is that partial unit DIEs are missing some for your use case essential attributes,
No, my complaint is that DWZ is just not worth it.
I do find your statistics per package useful because they show dwz is in general effective by producing at least 20% (more) on-disk size reduction, even though there are some packages where dwz doesn't seem as effective as it could be. We definitely should investigate those issues.
And is the 20% reduction of installed size, when whole *-debuginfo.rpms get installed, really worth the delay of 3.5 years of DWARF-5, delay of 3.5 years of LLDB index (.debug_names), years of incompatible LLVM and years of wasted engineering time?
I do not think so. Maybe Fedora/FESCo thinks differently and this is what I am asking about with my Change.
But I don't really understand why you then focus on the zstd compressed rpms (even if even those favor dwz).
I do not see how it favors dwz, I haven't seen and I haven't done a measurement of separate *.debug files compression of DWZ vs. -fdebug-types-section. My guess is there will not be a big difference for DWZ vs. -fdebug-types-section size ratio.
Or why for the debuginfod use case you seem to do the opposite, not take into account that the http debuginfod server will compress the files before sending over the network.
See the paragraph above.
-fdebug-types-section has better compression ratio than DWZ for *-debuginfo.rpm because for *.rpm the compression is applied for all its files together. The compression algorithm then finds the same parts of separate *.debug files similar to what DWZ does.
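The effect described above can be sketched with a small experiment. This is only an illustration under stated assumptions: Python's lzma module stands in for the rpm payload compressor, and the two "files" are made-up pseudo-random byte strings that share a common part, not real *.debug files.

```python
import lzma
import random


def pseudo_random_bytes(seed, length):
    """Deterministic, practically incompressible filler data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))


# Two hypothetical files that carry the same 20000-byte part
# (think: the same type descriptions duplicated in two *.debug files).
shared = pseudo_random_bytes(0, 20000)
file_a = shared + pseudo_random_bytes(1, 4000)
file_b = shared + pseudo_random_bytes(2, 4000)

# Compressing each file on its own cannot exploit the duplication ...
separate = len(lzma.compress(file_a)) + len(lzma.compress(file_b))
# ... while compressing them as one stream can reference the first copy.
together = len(lzma.compress(file_a + file_b))

print("separate:", separate, "together:", together)
```

The same reasoning is why a per-file comparison (as relevant for debuginfod) and a whole-package comparison can give different answers.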
I don't think either of those later statistics are really relevant with respect to your proposal.
I am not sure what exactly you mean. You can update the Wiki if you disagree with any numbers there.
Finally I am interested in your proposal to implement a different way to reduce the size of DIE trees by eliminating "unused" DIEs. It is hard to predict what effect that would have without seeing an implementation (in theory GCC with LTO would not actually generate debuginfo for unused functions).
It is fastest to wait a few days for the LLVM presentation: https://whova.com/embedded/session/llvm_202010/1193947/ 2) Fragmenting the DWARF to Enable Dead Debug Data Elimination
But I think that can be done separate from your proposal and combined with other size reduction techniques.
By that LLVM dead DIEs reduction talk I just wanted to show that apparently nobody cares too much about a few percent of the *-debuginfo.rpm size - otherwise it would already be coded/used. Or if there wasn't DWZ Fedora could already switch to DWARF-5 which saves probably more size than DWZ does.
A year ago I was talking about DWZ incompatibility with DWARF-5. As nobody has done anything in the past year I was going to compare DWZ + DWARF-4 against -fdebug-types-section + DWARF-5 which would be an easy win for -fdebug-types-section + DWARF-5. Unfortunately when I started proposing to drop DWZ due to it you started to resurrect the DWZ tool by porting it to DWARF-5. I do not find that a good idea long-term.
Jan
Hi Jan,
On Mon, Sep 28, 2020 at 04:50:59PM +0200, Jan Kratochvil wrote:
To make DWZ better consumable it needs to have the partial units separately parseable. That way they can be shared at IR level and not just at DWARF level That means:
- DW_TAG_partial_unit should have DW_AT_language.
- DW_TAG_partial_unit must contain only types (struct/class). Currently they contain for example also static constant variables but when you parse such independent DW_TAG_partial_unit into which dictionary you will register such variable? That makes no sense.
You might want to look at the experiments to do something like that from Tom de Vries: https://sourceware.org/pipermail/dwz/2020q1/000579.html https://sourceware.org/pipermail/dwz/2020q1/000568.html
It is all about engineering effort. I agree if the support of DWZ was trivial (or there were unlimited engineering resources) then DWZ is really better than -fdebug-types-section (except it would need a DWZ tool with fewer bugs and better coding practices). But reality shows the DWZ support is not trivial, engineering resources for Fedora are very limited, and so we have to decide whether the serious effort to support DWZ is better spent on DWZ or on making the debuggers better/really usable.
Hacking on dwz and supporting partial units and DWARF supplementary files in debugger-like tools isn't trivial. But it is IMHO also not such a big effort that we have to drop everything else. Let's see if we can work together on this.
Cheers,
Mark
On Tue, 29 Sep 2020 22:35:14 +0200, Mark Wielaard wrote:
On Mon, Sep 28, 2020 at 04:50:59PM +0200, Jan Kratochvil wrote:
- DW_TAG_partial_unit should have DW_AT_language.
- DW_TAG_partial_unit must contain only types (struct/class). Currently they contain for example also static constant variables but when you parse such independent DW_TAG_partial_unit into which dictionary you will register such variable? That makes no sense.
You might want to look at the experiments to do something like that from Tom de Vries: https://sourceware.org/pipermail/dwz/2020q1/000579.html
This is another extension of the DWZ tool and tags. The goal is to make the file format (and tooling) faster and simpler, not slower and more complex.
This looks like a fix for the DWZ DW_AT_language problem (#1 listed above).
Hacking on dwz and supporting partial units and DWARF supplementary files in debugger-like tools isn't trivial. But it is IMHO also not such a big effort that we have to drop everything else.
So why is Fedora stuck for 3.5 years on DWARF-4? I would have switched Fedora to DWARF-5 a long time ago but I could not as there isn't anyone willing to work on DWZ. Even now you want to port it to DWARF-5 but nothing more. It needs also to
* fix bugs
* support LLVM DWARF-5
* support .debug_names
* support -fdebug-types-section (for reduction of size already during linking)
* with all the effort it is a pity it gives up on large debuginfos as it would run out of memory
DWZ could be a nice tool (not really critical but an interesting challenge) but it is not important enough to make people find time working on it.
Jan
On 9/28/20 8:50 AM, Jan Kratochvil wrote:
On Mon, 28 Sep 2020 12:31:59 +0200, Mark Wielaard wrote:
If you want to make -fdebug-types-sections the default you really should work with the upstream GCC developers to figure out why they don't want that.
I haven't seen that, according to Richard Biener from GCC -fdebug-types-section is a normally supported GCC feature: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88878#c6 I am aware only of Jakub Jelinek who is against -fdebug-types-section. I should also state he is the author of DWZ.
-fdebug-types-section is a supported option in the sense that it's in the compiler and we'll fix bugs in it when we can. But the GCC community doesn't really test that option and it's known to be broken with LTO. With the move to LTO for F33 we'd need to fix whatever is broken with -fdebug-types-section (and I don't know precisely what that is) before we could even consider turning it on.
Jakub has a ton of experience with debug info generation in GCC as well as with the dwarf standardization process and based on those experiences he thinks dwz is a technically better solution. You disagree. I would strongly advise against hinting that Jakub is biased against -fdebug-types-section simply because he is the author of dwz. That just makes you look combative.
If GCC is unable to support such a trivial feature as -fdebug-types-section then Fedora should really already switch to the standard compiler. It will come sooner or later anyway. This deviation from standard tools just causes continuous troubles such as: [fesco] Issue #2020: Firefox is switching from gcc to clang/llvm https://pagure.io/fesco/issue/2020#comment-671672
I don't think it argues for that *at all*. Not even close.
I do think we want to bring GCC and LLVM to parity and remove the Fedora policy which favors GCC. I've still got a TODO to write the actual changes for the Fedora packaging guidelines to make that happen.
Trying to override upstream defaults in Fedora without understanding why upstream decided on the current defaults isn't a good idea IMHO.
You know very well Fedora already overrides upstream GCC defaults a lot: -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection
And, not surprisingly, our team has had significant input on the options we're using *and* the GCC implementation of those options. We make recommendations based on our experiences. That same experience would lead us to recommending against -fdebug-types-section at this time.
And DWZ is even a project unrelated to GCC so calling for upstream defaults would really call for dropping DWZ.
Again, I don't see that at all. dwz is a separate and distinct step in the build process. It's not overriding anything in the compiler at all.
I totally get that it is frustrating if you worked for a long time on a new feature to support some DWARF constructs for lldb
It is in no way a feature as it does not bring any user visible improvement. It is only a compatibility with marginally (0.1%) used file format with disputable benefits.
and you aren't able to get the patches in shape to be accepted upstream.
That is a repeated lie I have never even suggested. LLDB is fine accepting my DWZ patches. I understand you are used to difficulty of upstreaming patches in GNU projects but that is not the case of LLVM. In fact LLDB is a completely different world in accepting patches than GDB where it was taking me even 10+ years to get some patches accepted. For GDB you need to learn first how to do the ancient ineffective and bug-prone coding practices, force yourself to really execute them to become a global maintainer and only then you manage to get patches checked in.
You are repeatedly trying to make it look as the problem is upstreaming DWZ support. That is not any problem. The problem is the DWZ itself as it just isn't worth the effort of supporting it in all the consumers.
But -fdebug-types-section isn't a viable alternative right now IMHO.
As I am repeating again and again I just find DWZ too complicated for both production and consumption for so little gain (size reduction) it achieves. So before I complicate the LLDB codebase by the DWZ support and make it a catch-up game for Red Hat, Fedora and others forever (as Apple+Google is never going to use DWZ and they know why) I am trying by this Change to save a lot of time for everyone.
The years of engineering time I have already spent on DWZ and the years of engineering time I will have to spend on its future maintenance and reworks (for clang DWARF) could be better spent on improving the debugging experience. We are no longer living in 80es where few saved bytes of size were critical whether the debugger will be able to run or not. Apparently GNU developers still haven't realized that change.
dwz fixed a real problem for real customers. If you're going to suggest replacing it, you have to propose something that continues to solve those problems and hopefully brings them additional benefits. I realize that's a RHEL issue and that Fedora may ultimately want to do something different. But implying that the problem isn't real is just silly.
In fact, if you look at some packages in Fedora, their debuginfo is so large that they actually turn off debug symbols or throttle back the level of debugging information they generate so that they can successfully link.
But I am happy people now know about your patches and seem to find them useful.
"useful" means with my patches they can workaround the Fedora problem of encumbering its debuginfo by DWZ.
And this question is not about the existing patches. That waste of engineering time has been already done. It is about the future waste of time maintaining compatibility with the DWZ format almost nobody (0.1%) cares about.
This is bordering on flame-bait.
You say it is difficult to support DWARF partial units as generated by dwz in lldb, but dwz doesn't really do anything non-standard (and GCC with LTO also generates partial units).
"standard" means that the DWZ specification has been accepted into DWARF-5 standard. IMO that was a mistake. It may have happened because the DWZ author is a member of the DWARF standard committee. Apparently nobody has so far implemented any reasonable/effective consumer for DWARF-5 DWZ otherwise they would put some restrictions into the standard.
[ ... ]
I'm happy to see some concrete suggestions for fixing dwz. But I find the overall tone of this message fairly combative. I hope that's not your intention.
To make DWZ better consumable it needs to have the partial units separately parseable. That way they can be shared at IR level and not just at DWARF level That means:
- DW_TAG_partial_unit should have DW_AT_language.
- DW_TAG_partial_unit must contain only types (struct/class). Currently they contain for example also static constant variables but when you parse such independent DW_TAG_partial_unit into which dictionary you will register such variable? That makes no sense.
Currently DWZ has benefits only with the DWZ common file. Without DWZ common files DWZ produces files 1.6% bigger than -fdebug-types-section. Therefore if the DWZ common files saving of 3.3% per Fedora distribution size is worth it then the existing DWZ tool should be dropped anyway as one can use normal -fdebug-types-section and one can just write a simple tool moving DW_UT_type units to the DWZ common file and converting DW_AT_signature declarations to DW_FORM_ref_sup4 and we are done.
It would certainly be good to improve the on-disk distro size. 99% of the time nobody cares about debugging system bits. But we can't take a major regression in the debuginfo size to achieve that and -fdebug-types-section isn't functional for Fedora. So the only paths forward I see are to either fix -fdebug-types-section or improve dwz.
DWARF standard sometimes makes mistakes, for example .debug_pubnames and .debug_pubtypes were never really usable and DWARF-5 removed them. It may be perfectly possible the DWZ extension of DWARF-5 will be removed in DWARF-6 or future DWARF standards as it turns out it is not worth the format complexity. That some text has been accepted into the DWARF standard does not mean much.
Sure. We all make mistakes and standards bodies are no different. If you feel strongly that it was a mistake, then get involved in the process and try to correct it.
It is all about engineering effort. I agree if the support of DWZ was trivial (or there were unlimited engineering resources) then DWZ is really better than -fdebug-types-section (except it would need a DWZ tool with fewer bugs and better coding practices). But reality shows the DWZ support is not trivial, engineering resources for Fedora are very limited, and so we have to decide whether the serious effort to support DWZ is better spent on DWZ or on making the debuggers better/really usable.
Given how much you propose DWZ apparently the 3.3% Fedora distribution size increase (if DWZ is dropped) is considered as too much. Obviously if the size increase was just let's say 0.1% it would be acceptable to drop DWZ - do you agree? So where is your size limit where the years-effort of supporting DWZ is worth it? I would say my limit would be maybe 20%, far above the 3.3%.
Putting my Red Hat hat on, I get pushback from PM on *any* size increases in the RHEL space. While I often question the technical reasons behind why our customers are pushing for marginal (IMO) improvements, reality is they care. As much as we'd like to be in a world where a 1% increase in distro size doesn't matter, that's not the actual world we live in.
And our RHEL customers absolutely do care about the size of debuginfo because it affects link times.
Anyway, I'll put my Fedora hat back on :-)
For example during Fedora Package Review Process do some packages get rejected because they would make the distribution too large? Not worth of including such new package? I am not aware of such decision and it even sounds funny to me. But that is what you choose here by enforcing DWZ no matter how little savings it has.
I'm not involved in the new package review process. But I think you're mixing apples and potatoes here. You're talking about a change which will increase the size of every package. While a new package typically only shows up for people that actually install it. The two really aren't comparable.
DWZ is a nice engineering idea and a nice engineering challenge. But it has no meaning for end-users. This is why I have switched from GDB to LLVM as Google&Apple are solving real end-user problems and not artificial engineering challenges just to have some nice coding time. But in the end I end up stuck on another GNU non-sense (this time DWZ) needing to be supported by LLVM in Fedora only for compatibility reasons.
If your complaint is that partial unit DIEs are missing some for your use case essential attributes,
No, my complaint is that DWZ is just not worth it.
Your opinion. Right now mine is that it is very much worth it. That may change one day, but right now dwz is the best solution we've got.
I do find your statistics per package useful because they show dwz is in general effective by producing at least 20% (more) on-disk size reduction, even though there are some packages where dwz doesn't seem as effective as it could be. We definitely should investigate those issues.
And is the 20% reduction of installed size, when whole *-debuginfo.rpms get installed, really worth the delay of 3.5 years of DWARF-5, delay of 3.5 years of LLDB index (.debug_names), years of incompatible LLVM and years of wasted engineering time?
I do not think so. Maybe Fedora/FESCo thinks differently and this is what I am asking about with my Change.
I can't speak for FESCO.
But I don't really understand why you then focus on the zstd compressed rpms (even if even those favor dwz).
I do not see how it favors dwz, I haven't seen and I haven't done a measurement of separate *.debug files compression of DWZ vs. -fdebug-types-section. My guess is there will not be a big difference for DWZ vs. -fdebug-types-section size ratio.
Or why for the debuginfod use case you seem to do the opposite, not take into account that the http debuginfod server will compress the files before sending over the network.
See the paragraph above.
-fdebug-types-section has better compression ratio than DWZ for *-debuginfo.rpm because for *.rpm the compression is applied for all its files together. The compression algorithm then finds the same parts of separate *.debug files similar to what DWZ does.
ACK. But again until -fdebug-types-section works with LTO, it's a non-starter IMHO.
Finally I am interested in your proposal to implement a different way to reduce the size of DIE trees by eliminating "unused" DIEs. It is hard to predict what effect that would have without seeing an implementation (in theory GCC with LTO would not actually generate debuginfo for unused functions).
It is fastest to wait a few days for the LLVM presentation: https://whova.com/embedded/session/llvm_202010/1193947/ 2) Fragmenting the DWARF to Enable Dead Debug Data Elimination
Killing dead DIEs is good and doing it as early in the overall pipeline is better.
But I think that can be done separate from your proposal and combined with other size reduction techniques.
I agree with Mark here. It's an independent improvement.
By that LLVM dead DIEs reduction talk I just wanted to show that apparently nobody cares too much about a few percent of the *-debuginfo.rpm size - otherwise it would already be coded/used. Or if there wasn't DWZ Fedora could already switch to DWARF-5 which saves probably more size than DWZ does.
I wasn't even aware we had dead DIEs. That doesn't mean I don't care, it just means I wasn't aware. Others may be aware, but have higher priority items on their TODO lists. Stating that "nobody cares too much ..." isn't helpful.
Jeff
On Tue, Sep 29, 2020 at 7:32 PM Jeff Law law@redhat.com wrote:
On 9/28/20 8:50 AM, Jan Kratochvil wrote:
DWARF standard sometimes makes mistakes, for example .debug_pubnames and .debug_pubtypes were never really usable and DWARF-5 removed them. It may be perfectly possible the DWZ extension of DWARF-5 will be removed in DWARF-6 or future DWARF standards as it turns out it is not worth the format complexity. That some text has been accepted into the DWARF standard does not mean much.
Sure. We all make mistakes and standards bodies are no different. If you feel strongly that it was a mistake, then get involved in the process and try to correct it.
It is all about engineering effort. I agree if the support of DWZ was trivial (or there were unlimited engineering resources) then DWZ is really better than -fdebug-types-section (except it would need a DWZ tool with fewer bugs and better coding practices). But reality shows the DWZ support is not trivial, engineering resources for Fedora are very limited, and so we have to decide whether the serious effort to support DWZ is better spent on DWZ or on making the debuggers better/really usable.
Given how much you propose DWZ apparently the 3.3% Fedora distribution size increase (if DWZ is dropped) is considered as too much. Obviously if the size increase was just let's say 0.1% it would be acceptable to drop DWZ - do you agree? So where is your size limit where the years-effort of supporting DWZ is worth it? I would say my limit would be maybe 20%, far above the 3.3%.
Putting my Red Hat hat on, I get pushback from PM on *any* size increases in the RHEL space. While I often question the technical reasons behind why our customers are pushing for marginal (IMO) improvements, reality is they care. As much as we'd like to be in a world where a 1% increase in distro size doesn't matter, that's not the actual world we live in.
And our RHEL customers absolutely do care about the size of debuginfo because it affects link times.
Anyway, I'll put my Fedora hat back on :-)
I feel like it's worth giving my perspective here as someone who has done similar work in other distributions.
Because of how little debuginfo is used *in general*, every bit of space saving matters. The dwz technology took so long to be adopted because the rpm support never made it upstream. It took some poking and prodding to finally get it upstreamed for RPM 4.14, and in doing so, distros started using it because they were comfortable with relying on the feature.
I don't think people realize how scary the debuginfo code actually is for us "plebs" wanting to support it while having no knowledge or hope of understanding the depths of it all. It's a very esoteric system. Most of the distributions were unwilling to pull in an out-of-tree patch out of fear of being stuck supporting it alone. For RPM distributions, once it was mainlined, that fear was gone. The Debian distribution family had the benefit of being able to reimplement the whole system initially based on how it works in Fedora for them, and once they did that, it was inherited by that entire branch of Linux distributions.
I personally worked on enabling advanced debuginfo generation features in rpm for OpenMandriva when I helped migrate them to RPM 4.14 and DNF[1]. The only real trouble was with Clang being broken with split debuginfo+debugsource with Clang/LLVM 7[2], which was fixed after upgrading to Clang/LLVM 10[3].
The quality of life improvements with advanced rpm debuginfo generation and dwz have made it tremendously easier for me and others to be able to ship debuginfo data for more packages. And as developers *need* that data to be able to... well, debug things, it comes in handy having it everywhere.
I have also helped with bringing this to other distributions, but I feel that my story here with OpenMandriva is particularly important since we seem to be having a tug-of-war between the GCC and LLVM toolchain teams.
[1]: https://www.openmandriva.org/en/news/article/switching-to-rpmv4
[2]: https://github.com/OpenMandrivaSoftware/rpm-openmandriva-setup/commit/a54ffd...
[3]: https://github.com/OpenMandrivaSoftware/rpm-openmandriva-setup/commit/5b432a...
[...]
By that LLVM dead DIEs reduction talk I just wanted to show that apparently nobody cares too much about a few percent of the *-debuginfo.rpm size - otherwise it would already be coded/used. Or if there wasn't DWZ Fedora could already switch to DWARF-5 which saves probably more size than DWZ does.
I wasn't even aware we had dead DIEs. That doesn't mean I don't care, it just means I wasn't aware. Others may be aware, but have higher priority items on their TODO lists. Stating that "nobody cares too much ..." isn't helpful.
For the record, Mark has started implementing DWARF-5 support in dwz: https://sourceware.org/git/?p=dwz.git;a=log
I think I would rather like to see a Change proposal to switch to DWARF-5 for Fedora 34, especially since it looks like dwz will be ready for it.
-- 真実はいつも一つ!/ Always, there's only one truth!
Hi Neal,
On Tue, 2020-09-29 at 19:59 -0400, Neal Gompa wrote:
For the record, Mark has started implementing DWARF-5 support in dwz: https://sourceware.org/git/?p=dwz.git;a=log
I think I would rather like to see a Change proposal to switch to DWARF-5 for Fedora 34, especially since it looks like dwz will be ready for it.
That is indeed my goal, but I wasn't planning on filing a specific Change Proposal for it.
First because as you observed in the past we did some of these debuginfo things Fedora first and then it took years (!) for some of the default settings we had changed and helper scripts to make it upstream. So I am concentrating on getting everything ready upstream first before making and proposing any changes for Fedora.
Secondly I am hoping that because of the first point the GCC11 for Fedora 34 Change Proposal will simply say "-gdwarf-5 is now the default".
Lastly, and sadly, I find the whole Fedora change proposal debates extremely hostile. They often seem to quickly result in people attacking you because you made a choice to spend time to work on A and not their favorite feature B, and if they cannot have feature B then you should also not spend any more time on A.
So I am happy to describe the work I am doing to try to get DWARF5 the default for GCC11 and by extension for the Fedora 34 default toolchain, but I will mainly do that work upstream and then see whether it is all ready on time to enable it for Fedora 34. But I am not interested in a heated debate on how I should prioritize my time and energy.
= Why DWARF5 for GCC?
- A couple of new tags and attributes make it easier/more accurate to describe some of the newer language features (although most were already covered by various GNU extensions)
- A lot of GNU extensions to DWARF4 have been standardized in DWARF5. By adopting the standardized variant alternative toolchains will hopefully find it easier to support these features.
- The representation of various data structures in DWARF5 is much more efficient causing a 25% on-disk size reduction (before any other compression method) for the .debug sections: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553527.html
= DWARF5 for the (extended) GNU toolchain
- binutils (gas) is responsible for turning part of the assembly produced by the compiler into a line table (.debug_line) and the linker sometimes reads parts of the DWARF (for example when producing warnings about where a symbol was defined). The just released binutils 2.35.1 should have all fixes necessary to support DWARF5.
- gcc needs to use the new gas features. Jakub has a patch (not committed yet): https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553992.html
- gdb seems ready except for one corner case with C++ static member variables in classes. This is because in DWARF5 these are now represented as variables, which might be optimized away when not used. In this case gcc probably shouldn't optimize out the unused variables (or gdb should not depend on being able to show optimized out static member variables). Ongoing debate how to resolve this: https://sourceware.org/bugzilla/show_bug.cgi?id=26525 https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553102.html
- The just released elfutils 0.181 seems to have all needed support, which should cover systemtap, dwarves, perf, systemd, libabigail. More testing ongoing.
- For valgrind I initially wanted to switch the DWARF reader to an external helper program based on elfutils libdw. But to get a solution faster I will tweak the internal reader to deal with just the minimal DWARF5 as generated by gcc for now. I haven't started on this yet.
= DWARF5 for the (Fedora) packaging tools
- rpm debugedit has patches to support DWARF5 but we have to make sure they have testcases to work with gcc: http://lists.rpm.org/pipermail/rpm-maint/2020-August/014833.html
- dwz is seeing active work towards supporting DWARF5: https://sourceware.org/pipermail/dwz/2020q3/000668.html I am hoping that by end of next week we have generic support. That might not do optimal compression yet and will probably need lots of testing (and bug fixing).
= What I am NOT working on
- We'll keep using .gdb_index for now, moving to .debug_names only when that is ready in gdb: https://sourceware.org/pipermail/gdb/2020-September/048879.html
- Optional DWARF5 features like debug-types, forms, operations or index tables only used for split-dwarf by GCC (e.g. DW_FORM_strx, DW_FORM_addrx, DW_FORM_loclistx, DW_FORM_rnglistx, DW_OP_addrx, DW_OP_constx).
- Any other tool, project not mentioned above or other native toolchains like golang, rust, clang/llvm or ocaml. I expect those to simply keep producing DWARF4.
That doesn't mean I won't help with any of the above if others propose to do that work for the various pieces of the toolchain and packaging, but I currently don't have time for it and I don't think it is realistic for the Fedora 34 timeframe.
= Timeline
- I am hoping that all upstream work is ready by end of next month (end of October, start of November). Then we can backport any patches into Fedora for projects that haven't had a release yet and start integration testing once GCC11 drops in rawhide (assuming GCC upstream defaults to DWARF5).
On Wed, 30 Sep 2020 14:50:39 +0200, Mark Wielaard wrote:
= What I am NOT working on
[...]
- Any other tool, project not mentioned above or other native toolchains like golang, rust, clang/llvm or ocaml. I expect those to simply keep producing DWARF4.
So because of a DWZ deficiency you want to keep DWARF-5 in clang disabled, even though clang has supported DWARF-5 better and for longer than GCC.
Jan
On Wed, 30 Sep 2020 at 09:41, Jan Kratochvil jan.kratochvil@redhat.com wrote:
On Wed, 30 Sep 2020 14:50:39 +0200, Mark Wielaard wrote:
= What I am NOT working on
[...]
- Any other tool, project not mentioned above or other native toolchains like golang, rust, clang/llvm or ocaml. I expect those to simply keep producing DWARF4.
So because of a DWZ deficiency you want to keep DWARF-5 in clang disabled, even though clang has supported DWARF-5 better and for longer than GCC.
I did not take it to mean that. I took it to mean that he isn't going to tell other groups what to work on, which is what a change request seems to have become. He instead expects them to keep doing what they are doing if they want, rather than being forced to do what he is working on.
Jan
On Wed, 30 Sep 2020 15:58:51 +0200, Stephen John Smoogen wrote:
On Wed, 30 Sep 2020 at 09:41, Jan Kratochvil jan.kratochvil@redhat.com wrote:
On Wed, 30 Sep 2020 14:50:39 +0200, Mark Wielaard wrote:
= What I am NOT working on
[...]
- Any other tool, project not mentioned above or other native toolchains like golang, rust, clang/llvm or ocaml. I expect those to simply keep producing DWARF4.
So because of a DWZ deficiency you want to keep DWARF-5 in clang disabled, even though clang has supported DWARF-5 better and for longer than GCC.
I did not take it to mean that. I took it to mean that he isn't going to tell other groups what to work on, which is what a change request seems to have become. He instead expects them to keep doing what they are doing if they want, rather than being forced to do what he is working on.
Currently on files built by clang -gdwarf-5 DWZ will fail:
dwz: ./usr/lib64/libmatrix_client.so.0.3.1-0.3.1-2.fc34.x86_64.debug: Unknown debugging section .debug_addr
That is fine, as the file just does not get optimized. But it makes the rpm size for clang-built binaries 31.23% bigger, as -fdebug-types-section is not used. If -fdebug-types-section were used for clang-built binaries, DWZ would fail in a similar way but the size increase would be "just" 6.78%.
I do not find much of a difference there, just stating.
(These percents are relative to total *-debuginfo.rpm size, not to total distribution size.)
Jan
On Fri, 02 Oct 2020 18:11:45 +0200, Jan Kratochvil wrote:
On Wed, 30 Sep 2020 15:58:51 +0200, Stephen John Smoogen wrote:
On Wed, 30 Sep 2020 at 09:41, Jan Kratochvil jan.kratochvil@redhat.com wrote:
On Wed, 30 Sep 2020 14:50:39 +0200, Mark Wielaard wrote:
= What I am NOT working on
[...]
- Any other tool, project not mentioned above or other native toolchains like golang, rust, clang/llvm or ocaml. I expect those to simply keep producing DWARF4.
So because of a DWZ deficiency you want to keep DWARF-5 in clang disabled, even though clang has supported DWARF-5 better and for longer than GCC.
I did not take it to mean that. I took it to mean that he isn't going to tell other groups what to work on, which is what a change request seems to have become. He instead expects them to keep doing what they are doing if they want, rather than being forced to do what he is working on.
Currently on files built by clang -gdwarf-5 DWZ will fail:
dwz: ./usr/lib64/libmatrix_client.so.0.3.1-0.3.1-2.fc34.x86_64.debug: Unknown debugging section .debug_addr
That is fine, as the file just does not get optimized. But it makes the rpm size for clang-built binaries 31.23% bigger, as -fdebug-types-section is not used. If -fdebug-types-section were used for clang-built binaries, DWZ would fail in a similar way but the size increase would be "just" 6.78%.
I had a wrong idea. For clang -flto the option -fdebug-types-section no longer makes sense as clang produces direct DW_TAG_class references (DW_FORM_ref4).
So DWZ cannot optimize clang output better, DWZ makes sense only for GCC, not clang.
(DWZ can still help a bit, as it finds slightly more optimal DW_FORM_* variants than clang; I will check whether that can be fixed in clang.)
Jan
On 9/30/20 6:50 AM, Mark Wielaard wrote:
Hi Neal,
On Tue, 2020-09-29 at 19:59 -0400, Neal Gompa wrote:
For the record, Mark has started implementing DWARF-5 support in dwz: https://sourceware.org/git/?p=dwz.git;a=log
I think I would rather like to see a Change proposal to switch to DWARF-5 for Fedora 34, especially since it looks like dwz will be ready for it.
That is indeed my goal, but I wasn't planning on filing a specific Change Proposal for it.
First because as you observed in the past we did some of these debuginfo things Fedora first and then it took years (!) for some of the default settings we had changed and helper scripts to make it upstream. So I am concentrating on getting everything ready upstream first before making and proposing any changes for Fedora.
Secondly I am hoping that because of the first point the GCC11 for Fedora 34 Change Proposal will simply say "-gdwarf-5 is now the default".
In that change proposal, can you add a sentence or two indicating that I will do a test rawhide build with gcc-11 with dwarf5 on by default? My worry is that once Aldy/Andrew's Ranger work has finished I'll forget the need for dwarf5 testing.
jeff
On 9/29/20 5:59 PM, Neal Gompa wrote:
I feel like it's worth giving my perspective here as someone who has done similar work in other distributions.
Thanks for that viewpoint. As a compiler optimizer junkie, I don't really follow things on the RPM side, so hearing about how that process has played out is useful.
For the record, Mark has started implementing DWARF-5 support in dwz: https://sourceware.org/git/?p=dwz.git;a=log
I think I would rather like to see a Change proposal to switch to DWARF-5 for Fedora 34, especially since it looks like dwz will be ready for it.
Yea, I think moving to dwarf-5 would be a good step forward. ISTM we ought to do a test mass build with dwarf-5 by default. I'm currently testing some bits for other GCC developers, but ought to be able to do a dwarf-5 spin in a week or so to get a sense of any notable fallout.
jeff
On Wed, 30 Sep 2020 01:31:29 +0200, Jeff Law wrote:
But the GCC community doesn't really test that option and it's known to be broken with LTO.
I haven't seen any GCC PR for -fdebug-types-section being broken with LTO.
During one abigail diff I did not see any difference. I plan to run a full distribution abigail massrebuild+check as stated in the Change to exclude any possible incompatibilities. That would discover unfiled GCC problems with -fdebug-types-section.
Also I do not see why fixing -fdebug-types-section should be difficult in any way if the compiler can produce -fno-debug-types-section output. I can also write a postprocessor to get -fdebug-types-section output if GCC is unable to do that. That would of course lose the compilation-time performance benefits of -fdebug-types-section.
And, not surprisingly, our team has had significant input on the options we're using *and* the GCC implementation of those options. We make recommendations based on our experiences. That same experience would lead us to recommending against -fdebug-types-section at this time.
I think I also have some DWARF experience. Could you say what is wrong with -fdebug-types-section?
It would certainly be good to improve the on-disk distro size.
OK, going to file a Change to enable gcc -gz option (zlib section compression) as that will reduce on-disk *.debug size by 52.84%! Then we can disable both DWZ and -fdebug-types-section as those become pointless then.
So the only paths forward I see are to either fix -fdebug-types-section or improve dwz.
And obviously it is much easier to fix -fdebug-types-section than DWZ (if there really are any bugs in -fdebug-types-section; there are known bugs in DWZ that nobody wants to fix).
Putting my Red Hat hat on, I get pushback from PM on *any* size increases in the RHEL space.
When we start talking about RHEL (and CentOS) DWZ is completely pointless then as DWZ there saves only 0.28% of *-debuginfo.rpm (20MB of 7.2GB). Therefore approx. 0.14% of the distribution size.
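The percentages above can be rechecked with a short calculation. This is a minimal sketch: the 20MB and 7.2GB figures come from the message, while the decimal-gigabyte conversion and the assumption that debuginfo makes up roughly half of the distribution (which is what the quoted 0.14% implies) are illustrative.

```python
# Figures quoted in the message above, in megabytes.
dwz_saving_mb = 20
debuginfo_total_mb = 7.2 * 1000  # 7.2GB, assuming decimal gigabytes

# Share of the *-debuginfo.rpm size that DWZ saves.
saving_pct_of_debuginfo = 100 * dwz_saving_mb / debuginfo_total_mb
print(f"{saving_pct_of_debuginfo:.2f}% of *-debuginfo.rpm size")

# Assumption: debuginfo is about half of the whole distribution,
# so the saving relative to the distribution is about half as large.
debuginfo_share_of_distro = 0.5
saving_pct_of_distro = saving_pct_of_debuginfo * debuginfo_share_of_distro
print(f"{saving_pct_of_distro:.2f}% of distribution size")
```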
As much as we'd like to be in a world where a 1% increase in distro size doesn't matter, that's not the actual world we live in.
Unfortunately DWZ cannot decrease RHEL size by that 1%.
And our RHEL customers absolutely do care about the size of debuginfo because it affects link times.
System debuginfo format does not affect link times. Using DWZ when linking customers' applications definitely only increases linking time, as it is an extra step. I am not sure what you are talking about.
Production of debuginfo does affect compilation time but that is unrelated to DWZ. Production of debug info affects linking time only if -gsplit-dwarf is not used. (-gsplit-dwarf still affects linking time but only very little.) But that is all unrelated to DWZ.
Jan
On 9/30/20 3:50 AM, Jan Kratochvil wrote:
On Wed, 30 Sep 2020 01:31:29 +0200, Jeff Law wrote:
But the GCC community doesn't really test that option and it's known to be broken with LTO.
I haven't seen any GCC PR for -fdebug-types-section being broken with LTO.
I'm not aware of one either. But as Jakub has previously pointed out debug-types-section is disabled when LTO is enabled. I don't know the details of why that is done.
During one abigail diff I did not see any difference. I plan to run a full distribution abigail massrebuild+check as stated in the Change to exclude any possible incompatibilities. That would discover unfiled GCC problems with -fdebug-types-section.
Also I do not see why fixing -fdebug-types-section should be difficult in any way if the compiler can produce -fno-debug-types-section output. I can also write a postprocessor to get -fdebug-types-section output if GCC is unable to do that. That would of course lose the compilation-time performance benefits of -fdebug-types-section.
And, not surprisingly, our team has had significant input on the options we're using *and* the GCC implementation of those options. We make recommendations based on our experiences. That same experience would lead us to recommending against -fdebug-types-section at this time.
I think I also have some DWARF experience. Could you say what is wrong with -fdebug-types-section?
Your best bet is to discuss with Jakub and perhaps Jason. They're far more familiar with the debuginfo generation than I am.
It would certainly be good to improve the on-disk distro size.
OK, going to file a Change to enable gcc -gz option (zlib section compression) as that will reduce on-disk *.debug size by 52.84%! Then we can disable both DWZ and -fdebug-types-section as those become pointless then.
Note Mark's reply in the other thread. Section compression has significant tradeoffs.
So the only paths forward I see are to either fix -fdebug-types-section or improve dwz.
And obviously it is much easier to fix -fdebug-types-section than DWZ (if there really are any bugs in -fdebug-types-section; there are known bugs in DWZ that nobody wants to fix).
I think you're making an unsubstantiated leap here. Neither of us know what's wrong with GCC LTO and debug-types-section and others are working on dwz.
Putting my Red Hat hat on, I get pushback from PM on *any* size increases in the RHEL space.
When we start talking about RHEL (and CentOS) DWZ is completely pointless then as DWZ there saves only 0.28% of *-debuginfo.rpm (20MB of 7.2GB). Therefore approx. 0.14% of the distribution size.
Umm, we're fighting with PM these days over things in the 10M range. So, no it's not pointless.
As much as we'd like to be in a world where a 1% increase in distro size doesn't matter, that's not the actual world we live in.
Unfortunately DWZ cannot decrease RHEL size by that 1%.
I'm not asking it to. I'm saying that sizes matter, even in cases where you think they shouldn't.
And our RHEL customers absolutely do care about the size of debuginfo because it affects link times.
System debuginfo format does not affect link times. Using DWZ when linking customers' applications definitely only increases linking time, as it is an extra step. I am not sure what you are talking about.
Most customers don't use dwz. But they consume its output for the RPMs that we provide.
Jeff
On Wed, 07 Oct 2020 00:46:24 +0200, Jeff Law wrote:
On 9/30/20 3:50 AM, Jan Kratochvil wrote:
On Wed, 30 Sep 2020 01:31:29 +0200, Jeff Law wrote:
But the GCC community doesn't really test that option and it's known to be broken with LTO.
I haven't seen any GCC PR for -fdebug-types-section being broken with LTO.
I'm not aware of one either. But as Jakub has previously pointed out debug-types-section is disabled when LTO is enabled. I don't know the details of why that is done.
Because Jakub made a mistake and he still has not corrected himself. I have explained it in: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
If -fdebug-types-section was disabled in LTO mode then there would not be 6.78% vs. 31.23% difference of Fedora mass rebuild of -fdebug-types-section vs. -fno-debug-types-section. https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
Also this simple example shows it is not true:
c="static struct C{int a;C(){}} v;"
o="-Wall -gdwarf-5 -fdebug-types-section -flto -O2"
echo "$c"|gcc -c -o 1.o $o -x c++ -
echo "${c}int main(){}"|gcc -c -o 2.o $o -x c++ -
gcc -o 1 1.o 2.o $o
llvm-dwarfdump 1|grep DW_UT_type
0x00000042: Type Unit: length = 0x0000005b version = 0x0005 unit_type = DW_UT_type abbr_offset = 0x0023 addr_size = 0x08 name = 'C' type_signature = 0x4e76c0dda193eb61 type_offset = 0x0026 (next unit at 0x000000a1)
So the only paths forward I see are to either fix -fdebug-types-section or improve dwz.
And obviously it is much easier to fix -fdebug-types-section than DWZ (if there really are any bugs in -fdebug-types-section; there are known bugs in DWZ that nobody wants to fix).
I think you're making an unsubstantiated leap here. Neither of us know what's wrong with GCC LTO and debug-types-section
I know what: nothing! Although arguably the problem is GCC, as its LTO still produces -fdebug-types-section output at all. clang LTO ignores -fdebug-types-section as clang already does the DWZ-style class unification itself during the lld phase.
This discussion and the state of GCC vs. clang show me that getting rid of DWZ alone is less important, as it would be more productive for Fedora to get rid of GCC together with DWZ altogether.
and others are working on dwz.
That is their problem. I am trying to work on things that make sense (but I cannot).
When we start talking about RHEL (and CentOS) DWZ is completely pointless then as DWZ there saves only 0.28% of *-debuginfo.rpm (20MB of 7.2GB). Therefore approx. 0.14% of the distribution size.
Umm, we're fighting with PM these days over things in the 10M range.
So it is better to slow down getting a finally usable debugger by years to save 10MB of distro size? I really do not believe that. :-)
Most customers don't use dwz. But they consume its output for the RPMs that we provide.
They cannot because the LLVM tools Red Hat ships still do not support DWZ.
Jan
On Wed, 07 Oct 2020 09:58:37 +0200, Jan Kratochvil wrote:
On Wed, 07 Oct 2020 00:46:24 +0200, Jeff Law wrote:
On 9/30/20 3:50 AM, Jan Kratochvil wrote:
When we start talking about RHEL (and CentOS) DWZ is completely pointless then as DWZ there saves only 0.28% of *-debuginfo.rpm (20MB of 7.2GB). Therefore approx. 0.14% of the distribution size.
Umm, we're fighting with PM these days over things in the 10M range.
So it is better to slow down getting a finally usable debugger by years to save 10MB of distro size? I really do not believe that. :-)
I think you mean 10MB of normal rpm size. That is important for containers and other small devices. But the 20MB are only *-debuginfo.rpm size, those are only for developer machines. Developer machines are not concerned by 20MB.
Jan
On Tue, Oct 06, 2020 at 04:46:24PM -0600, Jeff Law wrote:
On 9/30/20 3:50 AM, Jan Kratochvil wrote:
On Wed, 30 Sep 2020 01:31:29 +0200, Jeff Law wrote:
But the GCC community doesn't really test that option and it's known to be broken with LTO.
I haven't seen any GCC PR for -fdebug-types-section being broken with LTO.
I'm not aware of one either. But as Jakub has previously pointed out debug-types-section is disabled when LTO is enabled. I don't know the details of why that is done.
Could you say what is wrong with -fdebug-types-section?
Your best bet is to discuss with Jakub and perhaps Jason. They're far more familiar with the debuginfo generation than I am.
It didn't use to work with LTO, but some patches have been backported so it at least is passed through now without crashing. There are however still various bugs in the implementation:
Excess debug info -fdebug-types-section https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78320
-fdebug-types-section drops DW_AT_object_pointer https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94875
gcc drops top-level dies with -fdebug-types-section https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90232
Fission + type units + compression are suboptimal https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78321
Cheers,
Mark
On Wed, 30 Sep 2020 01:31:29 +0200, Jeff Law wrote:
-fdebug-types-section is a supported option in the sense that it's in the compiler and we'll fix bugs in it when we can. But the GCC community doesn't really test that option and it's known to be broken with LTO.
I believe you base this information on Jakub Jelinek's internal company mail: Message-ID: 20200710092926.GJ2363@tucnak
IIUC that mail contains incorrect information. My apologies if my deduction is incorrect, I am also writing "IIUC" here. I am basing my information on explanation by GCC developer Richard Biener: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88878#c8
It is explained there that "in_lto_p" means GCC is in the second or a later phase of LTO, not that LTO is enabled at all (as Jakub Jelinek said in the internal mail). Also, during the rebuild of 20000+ packages with LTO and -fdebug-types-section that I have done, GCC did not produce the ICEs (= compiler crashes, (*)) that Jakub Jelinek claimed would happen.
So I really see no indication why GCC would not normally support -fdebug-types-section even with LTO. Also it is such a simple optimization of DWARF that there is no reason why there should be any long-term issues with it.
Jan
(*) There were 7 packages reproducing GCC crashes due to the following two GCC bugs specific to -fdebug-types-section. That is unrelated to the topic of the "in_lto_p" condition discussed above.
ICE: fortran+gnat: build_abbrev_table, at dwarf2out.c: -g -fdebug-types-section https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96471
ICE: c++: dwarf2out_abstract_function, at dwarf2out.c: -g -fdebug-types-section https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96472
On Mon, 2020-09-28 at 16:50 +0200, Jan Kratochvil wrote:
For example, during the Fedora Package Review Process, do some packages get rejected because they would make the distribution too large? Not worth including such a new package? I am not aware of such a decision and it even sounds funny to me. But that is what you choose here by enforcing DWZ no matter how small its savings are.
I'm not aware of this ever happening. What has happened is packages being dropped from the installation images because it makes them too large -- is that perhaps what you're referring to?
Best regards,
On Wed, 30 Sep 2020 23:42:56 +0200, Michel Alexandre Salim wrote:
On Mon, 2020-09-28 at 16:50 +0200, Jan Kratochvil wrote:
For example, during the Fedora Package Review Process, do some packages get rejected because they would make the distribution too large? Not worth including such a new package? I am not aware of such a decision and it even sounds funny to me. But that is what you choose here by enforcing DWZ no matter how small its savings are.
I'm not aware of this ever happening. What has happened is packages being dropped from the installation images because it makes them too large -- is that perhaps what you're referring to?
No because installation images never contain debuginfos.
Jan
On Mon, 28 Sep 2020 16:51:02 +0200, Jan Kratochvil wrote:
To make DWZ more easily consumable it needs to have the partial units separately parseable. That way they can be shared at the IR level and not just at the DWARF level. That means:
- DW_TAG_partial_unit should have DW_AT_language.
- DW_TAG_partial_unit must contain only types (struct/class). Currently they also contain, for example, static constant variables, but when you parse such an independent DW_TAG_partial_unit, in which dictionary would you register such a variable? That makes no sense.
Currently DWZ has benefits only with the DWZ common file. Without DWZ common files, DWZ produces files 1.6% bigger than -fdebug-types-section does. Therefore, if the DWZ common file saving of 3.3% of the Fedora distribution size is worth it, the existing DWZ tool should be dropped anyway: one can use normal -fdebug-types-section and just write a simple tool that moves DW_UT_type units to the DWZ common file and converts DW_AT_signature declarations to DW_FORM_ref_sup4, and we are done.
I have measured this "different common file" approach for DW_UT_type and it brings a size reduction of 18.14% of what DWZ achieves. Additionally dropping dead DIEs and -fdebug-types-section type declarations achieves about 28%; together this makes 46% of what DWZ saves possible without those overcomplicated constructs of DWZ.
Original comparison: against plain -fdebug-types-section, DWZ makes the Fedora distribution 3.3% smaller.
Against -fdebug-types-section with some simple optimizations (just reusing existing -fdebug-types-section code in consumers + DWZ common file opening in consumers, together with removal of dead DIEs), DWZ is still slightly smaller, but only by 1.8% of the Fedora distribution size.
Sure, my proposal does not expect that a few percent matter. I have checked further possibilities based on the mail replies here, which seem to insist on any debuginfo size savings.
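The step from 3.3% down to 1.8% follows from the two partial savings quoted above. A minimal sketch of that arithmetic, assuming the two shares (18.14% and about 28%) simply add up and both apply to the 3.3% DWZ saving; the variable names are illustrative only:

```python
# DWZ advantage over plain -fdebug-types-section, in % of distribution size.
dwz_saving_pct_of_distro = 3.3

# Shares of DWZ's reduction recovered without DWZ (figures from the message).
common_file_share = 18.14  # separate common file for DW_UT_type units
dead_die_share = 28.0      # dropping dead DIEs and type declarations (approximate)

# Assumption: the two shares are independent and simply add up.
recovered_share = common_file_share + dead_die_share
remaining_gap = dwz_saving_pct_of_distro * (1 - recovered_share / 100)

print(f"recovered {recovered_share:.0f}% of the DWZ reduction")
print(f"remaining DWZ advantage: {remaining_gap:.1f}% of distribution size")
```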
Jan
On Mon, 28 Sep 2020 12:31:59 +0200, Mark Wielaard wrote:
I do find your statistics per package useful because they show dwz is in general effective by producing at least 20% (more) on-disk size reduction,
I am ignoring the on-disk size, I always measure just *-debuginfo.rpm size.
If anyone is concerned about on-disk size then Fedora should have already enabled zlib section compression, which would reduce the on-disk *.debug size by 52.84% for the whole distro. Or even implement zstd section compression, probably with an even bigger size decrease (and an even lower performance hit, which is already low enough).
But Fedora has not enabled transparent zstd filesystem compression for F-33 btrfs by default and nobody enabled zlib for *.debug files (you even implemented decompressing zlib from *.debug files) so apparently nobody / you do not care about the on-disk size.
So why do you mention the on-disk size now?
Jan
Hi Jan,
On Mon, Sep 28, 2020 at 05:28:57PM +0200, Jan Kratochvil wrote:
On Mon, 28 Sep 2020 12:31:59 +0200, Mark Wielaard wrote:
I do find your statistics per package useful because they show dwz is in general effective by producing at least 20% (more) on-disk size reduction,
I am ignoring the on-disk size, I always measure just *-debuginfo.rpm size.
If anyone is concerned about on-disk size then Fedora should have already enabled zlib section compression, which would reduce the on-disk *.debug size by 52.84% for the whole distro. Or even implement zstd section compression, probably with an even bigger size decrease (and an even lower performance hit, which is already low enough).
I was just discussing that recently with the Hotspot Perf GUI maintainer. And we concluded that if .debug files were compressed then we would need an uncompressed cache somewhere. The issue with having the on-disk debuginfo files compressed is that for debugger/tracing/profiling tools it incurs a significant decompression time delay and extra memory usage. Especially for a profiling tool that only needs to quickly get a little information it is much more convenient to be able to simply mmap the .debug file, check the aranges and directly jump to the correct DIE offset. See e.g. https://github.com/KDAB/hotspot/issues/115
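The access-pattern difference described above can be illustrated with a small sketch. It uses a synthetic byte blob as a stand-in for a .debug section and a made-up DIE offset; it is not how any particular profiler or debugger reads DWARF, only a demonstration that a memory-mapped file allows a direct slice while a zlib stream has to be inflated up to the wanted offset.

```python
import mmap
import os
import tempfile
import zlib

# Stand-in for a .debug section: 1 MiB of deterministic, patterned bytes.
section = bytes((i * 31 + i // 251) % 256 for i in range(1 << 20))
die_offset = 700_000  # hypothetical offset of the DIE a profiler wants
die_size = 16         # hypothetical number of bytes to read there

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "plain.debug")
    with open(plain_path, "wb") as f:
        f.write(section)

    # Uncompressed file: map it and slice at the offset directly.
    with open(plain_path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        from_mmap = m[die_offset:die_offset + die_size]

# zlib-compressed data: the stream offers no random access, so everything
# before the wanted offset has to be inflated (and held in memory) first.
compressed = zlib.compress(section)
inflater = zlib.decompressobj()
inflated = inflater.decompress(compressed, die_offset + die_size)
from_zlib = inflated[die_offset:die_offset + die_size]

print("same bytes:", from_mmap == from_zlib)
print("bytes inflated to reach the DIE:", len(inflated))
```

An uncompressed cache, as mentioned in the message, would pay the inflation cost once and serve later lookups through the mmap path.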
So why do you mention the on-disk size now?
Because I believe it is the most important benchmark. The on-disk size is not just the installed file size, but it is also the in-memory size of the data structures that need to be stored and parsed. So a 20% smaller on-disk size also (roughly) means 20% fewer DIEs and abbrevs that need to be parsed or held in memory.
Cheers,
Mark
On Tue, 29 Sep 2020 22:29:44 +0200, Mark Wielaard wrote:
I was just discussing that recently with the Hotspot Perf GUI maintainer. And we concluded that if .debug files were compressed then we would need an uncompressed cache somewhere. The issue with having the on-disk debuginfo files compressed is that for debugger/tracing/profiling tools it incurs a significant decompression time delay and extra memory usage. Especially for a profiling tool that only needs to quickly get a little information it is much more convenient to be able to simply mmap the .debug file, check the aranges and directly jump to the correct DIE offset. See e.g. https://github.com/KDAB/hotspot/issues/115
First, it is a marginal use case. For GDB, which is popular here, I tested zlib on a (IIRC) 500MB+ .debug file and the startup time went 11.00s->12.45s = +13.20%. Given that GDB takes minutes to print something on such .debug files, the <2s longer startup does not matter.
And then all this is about zlib compression. Facebook has developed zstd, which is much faster. Google says it is faster than reading the .debug files; on my machine zstd and NVMe disk reads both run at 2GB/s. I expect btrfs even has an in-memory cache for decompressed files, but I have not checked it now as all the numbers I have collected have no effect here anyway.
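The startup overhead quoted above can be recomputed from the two timings. A minimal sketch using only the rounded figures from the message (the original measurement presumably had more digits, so the result matches only to about one decimal place):

```python
# GDB startup times from the message, in seconds.
baseline_s = 11.00   # uncompressed .debug file
with_zlib_s = 12.45  # zlib-compressed .debug file

overhead_s = with_zlib_s - baseline_s
overhead_pct = 100 * overhead_s / baseline_s

print(f"extra startup time: {overhead_s:.2f}s ({overhead_pct:.1f}%)")
```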
Because I believe it is the most important benchmark. The on-disk size is not just the installed file size, but it is also the in-memory size of the data structures that need to be stored and parsed. So a 20% smaller on-disk size also (roughly) means 20% fewer DIEs and abbrevs that need to be parsed or held in memory.
The problem is that you have to wait for minutes for GDB to print anything. It is faster to add cout<<, recompile and rerun the program (with clang+lld as with g++ it takes more than 3x as much time) than to wait for GDB. LLDB would sure print it immediately but it is incompatible with Fedora DWARF. Enjoy.
Jan
* Jan Kratochvil:
On Tue, 29 Sep 2020 22:29:44 +0200, Mark Wielaard wrote:
I was just discussing that recently with the Hotspot Perf GUI maintainer. And we concluded that if .debug files were compressed then we would need an uncompressed cache somewhere. The issue with having the on-disk debuginfo files compressed is that for debugger/tracing/profiling tools it incurs a significant decompression time delay and extra memory usage. Especially for a profiling tool that only needs to quickly get a little information it is much more convenient to be able to simply mmap the .debug file, check the aranges and directly jump to the correct DIE offset. See e.g. https://github.com/KDAB/hotspot/issues/115
First, it is a marginal use case.
Why do you think that? Using debuginfo for perf and the like seems to be much more common than actual debugging, based on what I see downstream.
The problem is that you have to wait for minutes for GDB to print anything.
Is this about slow tab completion?
It is faster to add cout<<, recompile and rerun the program (with clang+lld as with g++ it takes more than 3x as much time) than to wait for GDB. LLDB would sure print it immediately but it is incompatible with Fedora DWARF. Enjoy.
I can't use LLDB because it does not support thread-local variables. Not even initial-exec variables, which could be implemented without peeking at glibc internals.
Thanks, Florian
On Mon, 05 Oct 2020 08:28:03 +0200, Florian Weimer wrote:
Why do you think that? Using debuginfo for perf and the like seems to be much more common than actual debugging, based on what I see downstream.
OK, interesting, thanks for the info. Still that does not change anything with btrfs zstd. btrfs zstd would save on-disk size ~50% (I did not measure zstd for Fedora distro, zlib saves on Fedora distro 52%) compared to DWZ's on-disk size saving of 20% (or just 10% if one makes some non-DWZ optimizations in addition to -fdebug-types-section).
The problem is that you have to wait for minutes for GDB to print anything.
Is this about slow tab completion?
No. GDB always converts DWARF->IR (=expands) for whole CUs (Compilation Units). Moreover for some constructs GDB requires complete type definition which makes CUs dependent on other CUs. So in practice accessing one variable will expand around 50 CUs. And each CU is nowadays huge for real C++ templatized programs.
In other cases (such as in lambda functions) GDB even expands all CUs completely. That is probably some GDB bug. But it makes GDB eat 20+ GB of memory and take several minutes of runtime on fast machines to print a variable.
Compared to that, LLDB expands only one DIE and is done. But LLDB needs to know where the DIE is. This is why LLDB needs the .debug_names index and not .gdb_index, which does not contain DIE offsets. And this is why .gdb_index is easy to update for DWZed DWARF, while for .debug_names there is currently not even a DWARF specification of how to make it compatible with DWZ. A DWZ .debug_names index has to express two CUs for each DIE:
(1) the DW_TAG_partial_unit where the DIE is located in the DWARF file (and whether it is in the DWZ common file or not)
(2) the DW_TAG_compile_unit which imported that DW_TAG_partial_unit via DW_TAG_imported_unit
LLDB DWZ support had to extend its in-memory index this way, but that wastes LLDB memory and runtime on all OSes not using DWZ (= most LLDB use: Android/iOS/OSX).
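The two-CU requirement for a DWZ-aware index can be pictured as an index entry with extra fields. The following is a purely hypothetical sketch: the class, field names and offsets are invented for illustration and do not reflect LLDB's actual in-memory index or any DWARF specification. It also assumes that one shared DIE needs one entry per importing compile unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DwzIndexEntry:
    """Hypothetical name-index entry for a DIE in DWZ-processed DWARF."""
    name: str
    die_offset: int           # where the DIE itself is located
    partial_unit_offset: int  # (1) DW_TAG_partial_unit that holds the DIE
    in_common_file: bool      # whether that unit is in the DWZ common file
    importing_cu_offset: int  # (2) DW_TAG_compile_unit importing the unit


def entries_for(name, die_offset, partial_unit_offset, in_common_file,
                importing_cus):
    """Build one entry per compile unit that imports the partial unit."""
    return [
        DwzIndexEntry(name, die_offset, partial_unit_offset,
                      in_common_file, cu)
        for cu in importing_cus
    ]


# A type "C" stored once in the common file but imported by three CUs
# (all offsets are made-up example values).
entries = entries_for("C", die_offset=0x26, partial_unit_offset=0x0B,
                      in_common_file=True,
                      importing_cus=[0x100, 0x2A0, 0x7F4])
for entry in entries:
    print(entry)
```

Without DWZ, the last three fields would be unnecessary, which is the memory and runtime overhead on non-DWZ systems that the message refers to.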
It is faster to add cout<<, recompile and rerun the program (with clang+lld, as with g++ it takes more than 3x as much time) than to wait for GDB. LLDB would surely print it immediately but it is incompatible with Fedora DWARF. Enjoy.
I can't use LLDB because it does not support thread-local variables.
I have this item in my LLDB TODO list (and I even fixed the TLS support in GDB in the past). But without DWARF (DWZ) support it makes no sense to support TLS. So I rather returned to working on the DWZ support saving 1.8% of distribution size wasting 2 years on it and probably going to continue wasting more forthcoming years on those 1.8% of distribution size.
Not even initial-exec variables, which could be implemented without peeking at glibc internals.
Yes, a lot of useful things could be working if we did not have to waste time on useless stuff.
Jan
* Jan Kratochvil:
On Mon, 05 Oct 2020 08:28:03 +0200, Florian Weimer wrote:
Why do you think that? Using debuginfo for perf and the like seems to be much more common than actual debugging, based on what I see downstream.
OK, interesting, thanks for the info. Still that does not change anything with btrfs zstd. btrfs zstd would save on-disk size ~50% (I did not measure zstd for Fedora distro, zlib saves on Fedora distro 52%) compared to DWZ's on-disk size saving of 20% (or just 10% if one makes some non-DWZ optimizations in addition to -fdebug-types-section).
Fedora Server still defaults to LVM and XFS, as far as I know. I expect that downstream will continue to use XFS as well.
I don't think you can assume that debugging information will be stored on btrfs file systems.
Thanks, Florian
On Mon, 05 Oct 2020 09:20:53 +0200, Florian Weimer wrote:
Fedora Server still defaults to LVM and XFS, as far as I know. I expect that downstream will continue to use XFS as well.
I don't think you can assume that debugging information will be stored on btrfs file systems.
OK. Then DWARF consumers could just decompress zstd themselves. Then we can argue about tracing tools overhead. Either cache of uncompressed files or even some little performance hit. Still that would be worth 50% of debuginfo size. And that would be needed only on non-btrfs filesystems which (maybe) will disappear in the future.
Jan
On 10/4/20 2:48 PM, Jan Kratochvil wrote:
On Tue, 29 Sep 2020 22:29:44 +0200, Mark Wielaard wrote:
I was just discussing that recently with the Hotspot Perf GUI maintainer. And we concluded that if .debug files would be compressed then we would need an uncompressed cache somewhere. The issue with having the on-disk debuginfo files compressed is that for debugger/tracing/profiling tools it incurs a significant decompression time delay and extra memory usage. Especially for a profiling tool that only needs to quickly get a little information it is much more convenient to be able to simply mmap the .debug file, check the aranges and directly jump to the correct DIE offset. See e.g. https://github.com/KDAB/hotspot/issues/115
First, it is a marginal use case. For GDB, which is popular here, I tested zlib on some IIRC 500MB+ .debug file and the startup time was 11.00s->12.45s = +13.20%. Given GDB takes minutes to print something on such .debug files the <2s larger startup does not matter.
I'm not sure it's that marginal. It may not matter for GDB since I don't think the bottleneck is likely the decompression. It would certainly matter for profiling and the like -- I'd probably argue that dwarf is a terrible choice for those tools, but I don't see CTF as a viable alternative though, so they're stuck in the dwarf world.
And then all this is about zlib compression. Facebook has developed zstd which is much faster. Google says faster than reading the .debug files, on my machine both zstd and NVMe disk read are both 2GB/s. I expect btrfs has even in-memory cache for decompressed files but I have not checked it now as all the numbers I have collected have no effect here anyway.
Please, let's stop talking about btrfs here. It's not useful.
However, I think it's perfectly valid to discuss zstd if folks wanted to change the compression scheme for debug sections. In fact, I'd claim sticking with zlib, gzip, bzip2, xz, 7z, etc is unwise. The world has moved and zstd seems like the place we should be. In fact, we use it for various things within GCC already.
Jeff
Hi,
Changing subject because this has nothing to do with that Change Proposal anymore.
On Tue, Oct 06, 2020 at 01:49:13PM -0600, Jeff Law wrote:
However, I think it's perfectly valid to discuss zstd if folks wanted to change the compression scheme for debug sections. In fact, I'd claim sticking with zlib, gzip, bzip2, xz, 7z, etc is unwise. The world has moved and zstd seems like the place we should be. In fact, we use it for various things within GCC already.
Personally I must admit that I am not really a fan of using ELF section/file compression. It makes it impossible to simply mmap the data in or to quickly read just a tiny bit because you first have to decompress (and allocate new memory) for it. IMHO .debug files are no different from other ELF files for which we would also not do this. We can just use rpm package compression to reduce the distro distribution size, but we should not (re)compress the install/on-disk files. That will just mean programs will create an extra cache of uncompressed files they need to consult frequently.
But if you are going to do it, then it does make sense to use the most efficient algorithm there is. If you are going to experiment with this there are two ways to go about it.
First you can use full file compression. That is actually already (almost transparently) supported for tools based on elfutils libdw when using the dwfl functions. For example eu-readelf (and some other eu- tools) can be run on any compressed file directly (the version in rawhide also supports zstd because the vmlinuz file now uses that). You would have to convince other DWARF consumers to do the same. And decide for those that use .gnu_debuglink instead of build-id lookups (or when build-id lookup fails) whether the filenames should include the compression extension like .zst or if a consumer is responsible for trying all (?) known compression extensions when resolving the .debug file.
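The second option for .gnu_debuglink lookups (a consumer probing known compression extensions) could look roughly like the sketch below. The suffix list and the function name are assumptions made for illustration only; no existing DWARF consumer is claimed to behave this way.

```python
from pathlib import PurePosixPath

# Hypothetical list of suffixes a consumer might probe, uncompressed first.
# This list is an assumption for illustration, not taken from any tool.
COMPRESSION_SUFFIXES = ("", ".zst", ".xz", ".gz")


def debuglink_candidates(debug_dir, debuglink_name):
    """Return the file names a consumer could try for one .gnu_debuglink."""
    base = PurePosixPath(debug_dir) / debuglink_name
    return [str(base) + suffix for suffix in COMPRESSION_SUFFIXES]


if __name__ == "__main__":
    for path in debuglink_candidates("/usr/lib/debug/usr/bin", "ls.debug"):
        print(path)
```

The cost of this approach is one extra failed lookup per unknown suffix, which is the trade-off against encoding the extension in the .gnu_debuglink name itself.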
Secondly you can use ELF section compression. The ELF spec leaves room for adding new compression algorithms. The Chdr struct(s) contain a ch_type which describes the algorithm. Currently only one is specified, but there is a lot of room for expansion:
/* Legal values for ch_type (compression algorithm). */
#define ELFCOMPRESS_ZLIB   1          /* ZLIB/DEFLATE algorithm. */
#define ELFCOMPRESS_LOOS   0x60000000 /* Start of OS-specific. */
#define ELFCOMPRESS_HIOS   0x6fffffff /* End of OS-specific. */
#define ELFCOMPRESS_LOPROC 0x70000000 /* Start of processor-specific. */
#define ELFCOMPRESS_HIPROC 0x7fffffff /* End of processor-specific. */
So you could propose something on gnu-gabi@sourceware.org for a GNU extension or at generic-abi@googlegroups.com for a generic ELF one. And then get the ELF processing tools to adopt the new compression type.
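To make the role of ch_type concrete, here is a minimal sketch of how a consumer might read the compression header of a 64-bit little-endian ELF section and dispatch on the algorithm. It assumes the caller has already extracted the raw bytes of an SHF_COMPRESSED section; the function name is made up for this example, and only the ZLIB value listed above is handled. A new algorithm would need its own ch_type value assigned first.

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1  # the only algorithm listed in the defines above

# Elf64_Chdr, little-endian: ch_type (u32), ch_reserved (u32),
# ch_size (u64, uncompressed size), ch_addralign (u64).
ELF64_CHDR = struct.Struct("<IIQQ")


def decompress_section(raw):
    """Decompress the raw bytes of one compressed section (64-bit LE ELF)."""
    ch_type, _reserved, ch_size, _align = ELF64_CHDR.unpack_from(raw, 0)
    payload = raw[ELF64_CHDR.size:]
    if ch_type != ELFCOMPRESS_ZLIB:
        # A future algorithm would be dispatched here by its ch_type value.
        raise ValueError(f"unsupported ch_type {ch_type:#x}")
    data = zlib.decompress(payload)
    if len(data) != ch_size:
        raise ValueError("uncompressed size does not match ch_size")
    return data
```

The whole section has to be decompressed into new memory before any byte of it can be read, which is the mmap concern raised above.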
Cheers,
Mark
On 10/6/20 3:59 PM, Mark Wielaard wrote:
Hi,
Changing subject because this has nothing to do with that Change Proposal anymore.
On Tue, Oct 06, 2020 at 01:49:13PM -0600, Jeff Law wrote:
However, I think it's perfectly valid to discuss zstd if folks wanted to change the compression scheme for debug sections. In fact, I'd claim sticking with zlib, gzip, bzip2, xz, 7z, etc is unwise. The world has moved and zstd seems like the place we should be. In fact, we use it for various things within GCC already.
Personally I must admit that I am not really a fan of using ELF section/file compression. It makes it impossible to simply mmap the data in or to quickly read just a tiny bit because you first have to decompress (and allocate new memory) for it. IMHO .debug files are no different from other ELF files for which we would also not do this. We can just use rpm package compression to reduce the distro distribution size, but we should not (re)compress the install/on-disk files. That will just mean programs will create an extra cache of uncompressed files they need to consult frequently.
I'm not taking a position on whether or not we compress sections. My position is that if we're compressing them, then zstd seems like a better solution than the others mentioned. I certainly understand the desire to just mmap in the stuff and move on.
[ ... ]
Secondly you can use ELF section compression. The ELF spec leaves room for adding new compression algorithms. The Chdr struct(s) contain a ch_type which describes the algorithm. Currently only one is specified, but there is a lot of room for expansion:
/* Legal values for ch_type (compression algorithm). */
#define ELFCOMPRESS_ZLIB   1          /* ZLIB/DEFLATE algorithm. */
#define ELFCOMPRESS_LOOS   0x60000000 /* Start of OS-specific. */
#define ELFCOMPRESS_HIOS   0x6fffffff /* End of OS-specific. */
#define ELFCOMPRESS_LOPROC 0x70000000 /* Start of processor-specific. */
#define ELFCOMPRESS_HIPROC 0x7fffffff /* End of processor-specific. */
So you could propose something on gnu-gabi@sourceware.org for a GNU extension or at generic-abi@googlegroups.com for a generic ELF one. And then get the ELF processing tools to adopt the new compression type.
ohhh, I didn't know it was baked in at this level. Yea, if we're going to do section compression with zstd, then it's clearly best to get it officially supported at the ABI level.
jeff
I'm missing some good statistics.
- DWZ advantage: On the whole Fedora distro it saves 3.3% (5GB of the
157GB distribution size)
What is this comparing? Is this the size of binary rpm or the installation-on-disk footprint?
I would love to see a comparison of numbers for three things:
- raw debuginfo without dwz or -fdebug-types-section
- debuginfo with dwz (current approach)
- debuginfo with -fdebug-types-section
For each of those three categories both measures (rpm size and on-disk size) would be useful. Could you provide numbers like this for some subset of packages (20-30 packages that produce debuginfo would be enough to get a good measure).
I find that 3.3% number strange — it would mean that dwz is essentially useless, but maybe I'm misunderstanding how it's defined. I think we need to get some better understanding what the effects of various approaches are before discussing which to pick.
Zbyszek
** If the 3.3% size increase is a concern I can implement a different optimization ([https://whova.com/embedded/session/llvm_202010/1193947/ talk (2)]) as a GCC post-processing phase which would require no changes in any DWARF consumers.
- DWZ disadvantage: DWZ has currently less support across consumers (LLDB, llvm-dwarfdump, binutils readelf)
- DWZ disadvantage: DWZ requires 8x times more complicated (LoC count) support in consumers than -fdebug-types-section.
- DWZ disadvantage: DWZ cannot update LLVM .debug_names index which can be generated only by clang (it cannot be regenerated later for DWZ-compressed file)
- DWZ disadvantage: DWZ DWARF-5 support is a work-in-progress. DWZ has been blocking DWARF-5 for Fedora for 3.5 years and only after I have now proposed to drop DWZ Mark Wielaard has started porting DWZ to DWARF-5. It can be expected next DWARF extensions will remain unsupported again. Even currently there is no plan to support DWARF-5 features used by clang which may need -fdebug-types-section for clang-built binaries or no size optimization of clang-built debug info at all.
- DWZ disadvantage: Compilation (linking) requires for C++ up to 2x as big disk space (as DWZ is processing files after linker and DWZ is incompatible with -fdebug-types-section)
- DWZ disadvantage: Compilation (linking) is slower
This proposed DWARF format was originally submitted already for Fedora 18 as [[Features/DebugTypesSections]].
== Benefit to Fedora ==
- Better compatibility with existing debugging and tracing tools, primarily [https://lldb.llvm.org/ LLDB].
- Less resource-intensive rebuilds of C++ packages (in disk space, memory requirements and compilation time).
== Scope ==
- Proposal owners: It affects all packages generating *-debuginfo.rpm, that is compiled (not scripted) languages.
- Other developers: Report any possible debuginfo incompatibility (unexpected).
- Release engineering: [https://pagure.io/releng/issues #Releng issue number] (a check of an impact with Release Engineering is needed)
- Policies and guidelines: All the needed changes should be done in [https://src.fedoraproject.org/rpms/redhat-rpm-config redhat-rpm-config]. The [https://src.fedoraproject.org/rpms/dwz dwz package] can be then retired.
- Trademark approval: N/A (not needed for this Change)
- Alignment with Objectives: The size differences are only for *-debuginfo.rpm which is outside of scope of the listed objectives.
== Upgrade/compatibility impact ==
As *-debuginfo.rpm have to exactly match NVRA of its binary package the compatibility is not relevant. Existing tools supporting DWZ will still support the DWZ file format in packages which have not been rebuilt.
== How To Test ==
The change will update [https://src.fedoraproject.org/rpms/redhat-rpm-config redhat-rpm-config] by [https://people.redhat.com/jkratoch/redhat-rpm-config-fdebug-types-section.pa... an -fdebug-types-section patch].
Then one can use rpmbuild to rebuild a package. For mock use -a|--addrepo with modified redhat-rpm-config.rpm (with increased NVRA). For packages already rebuilt in Koji nothing is needed.
Test programs like lldb and gdb if they still can print source code, function parameters, variables etc.
One should also verify integrated testsuites of tools like clang, lldb, gcc, binutils, gdb, elfutils or rpm are not regressing with the -fdebug-types-section option.
One can also compare *.debug files built with/without DWZ and/or -fdebug-types-section using [https://src.fedoraproject.org/rpms/libabigail libabigail] utility dwdiff but that will be rather done by the change owner.
== User Experience ==
No user visible change. This affects which tools developers can use.
== Dependencies ==
none
== Contingency Plan ==
- Contingency mechanism: Revert the change in [https://src.fedoraproject.org/rpms/redhat-rpm-config redhat-rpm-config]. Fedora can continue using DWZ, just some debugging/tracing tools will stay incompatible.
- Contingency deadline: beta freeze
- Blocks release? No
- Blocks product? N/A
== Documentation ==
- [http://www.dwarfstd.org/doc/DWARF5.pdf DWARF-5] E.2 Using Type Units
- [https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#index-fdebug-types... GCC -fdebug-types-section]
--
Ben Cotton
He / Him / His
Senior Program Manager, Fedora & CentOS Stream
Red Hat
TZ=America/Indiana/Indianapolis
On Fri, 25 Sep 2020 12:10:22 +0200, Zbigniew Jędrzejewski-Szmek wrote:
I'm missing some good statistics.
I have 1.6TB of statistics, ask me anything. It is calculated by my scripts: https://git.jankratochvil.net/?p=massrebuild.git;a=tree git clone git://git.jankratochvil.net/massrebuild
- DWZ advantage: On the whole Fedora distro it saves 3.3% (5GB of the
157GB distribution size)
What is this comparing? Is this the size of binary rpm or the installation-on-disk footprint?
I am usually talking about *-debuginfo.rpm size.
Another possible number is separate *.debug files download (DWZ is then 6% bigger than -fdebug-types-section due to the associated DWZ common files).
I would love to see a comparison of numbers for three things:
- raw debuginfo without dwz or -fdebug-types-section
Oops, I do not have this number, I can run new massrebuild, it takes about 4 days (depending on availability of beefy machines).
- debuginfo with dwz (current approach)
rpm size: 35186079102 disk size: 177913332940
- debuginfo with -fdebug-types-section
rpm size: 37570327765 disk size: 214927514757
= DWZ rpm size is smaller by 6.78% = DWZ on-disk size is smaller by 20.8%
It is based on 22080 Fedora Rawhide packages rebuilt on 2020-08-24.
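As a cross-check of the percentages above, the following sketch recomputes them from the four raw sizes. It assumes, based on the arithmetic working out, that "smaller by" divides the difference by the DWZ size; dividing by the -fdebug-types-section size instead gives about 6.35% for the rpm size. The helper name is made up for this example.

```python
def percent_difference(bigger, smaller, relative_to):
    """Size difference as a percentage of the chosen baseline."""
    return 100.0 * (bigger - smaller) / relative_to


# Sizes in bytes, copied from the figures above.
dwz_rpm, dwz_disk = 35186079102, 177913332940
fdts_rpm, fdts_disk = 37570327765, 214927514757

# Difference relative to the DWZ size.
rpm_vs_dwz = percent_difference(fdts_rpm, dwz_rpm, dwz_rpm)
disk_vs_dwz = percent_difference(fdts_disk, dwz_disk, dwz_disk)

# Same rpm difference relative to the -fdebug-types-section size.
rpm_vs_fdts = percent_difference(fdts_rpm, dwz_rpm, fdts_rpm)

if __name__ == "__main__":
    print(f"rpm, relative to DWZ:  {rpm_vs_dwz:.2f}%")
    print(f"disk, relative to DWZ: {disk_vs_dwz:.2f}%")
    print(f"rpm, relative to -fdebug-types-section: {rpm_vs_fdts:.2f}%")
```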
For each of those three categories both measures (rpm size and on-disk size) would be useful.
Another big variable is F-34 should be hopefully in DWARF-5 (F-33 is DWARF-4) which will change the numbers a bit (unaware which way). Currently DWZ is not yet ported to DWARF-5 so there is no way to compare it. Also DWZ does not plan to support LLVM DWARF-5 so that will also skew such comparison even after its port.
For on-disk size it will all get different by F-33 btrfs compression again which should reduce the size by about 50% (which makes any DWZ/-fdebug-types-section differences pointless). It will obviously make the on-disk size difference smaller (than current 20.8%).
And finally on-disk size depends a lot on which *-debuginfo packages you have installed which varies a lot when stddev is twice the average DWZ saving.
Could you provide numbers like this for some subset of packages (20-30 packages that produce debuginfo would be enough to get a good measure).
The problem with these numbers is that they depend too much on the chosen set of rpms, so 20-30 packages do not say anything. DWZ against -fdebug-types-section saves 6.35% of the total size for the whole of Rawhide. When averaged for each package it is 5.44% (that means DWZ saves more on bigger-than-median packages) but the stddev of the saving is +/-11%.
Packages where -fdebug-types-section is smallest (by difference in bytes):
70.11: julia-1.5.0-1.fc33.src.rpm -fdebug-types-section size=866936043 DWZ size=1236511762
74.43: nodejs-14.7.0-1.fc33.src.rpm -fdebug-types-section size=921485027 DWZ size=1238008099
77.84: mozjs78-78.1.0-1.fc33.src.rpm -fdebug-types-section size=623280098 DWZ size=800743010
Packages where DWZ is smallest (by difference in bytes):
508.93: kea-1.7.9-3.fc33.src.rpm -fdebug-types-section size=1379013840 DWZ size=270963319
143.07: paraview-5.8.1-1.fc33.src.rpm -fdebug-types-section size=11462175974 DWZ size=8011695061
196.49: hpx-1.4.1-4.fc33.src.rpm -fdebug-types-section size=10981369919 DWZ size=5588742102
All these sizes are for *-debuginfo.rpm.
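The leading number on each package line appears to be the -fdebug-types-section size expressed as a percentage of the DWZ size. That reading is an inference from the arithmetic, not something stated in the mail. A short sketch that recomputes it from the listed sizes:

```python
# (package, -fdebug-types-section size, DWZ size) in bytes, from the list above.
PACKAGES = [
    ("julia-1.5.0-1.fc33", 866936043, 1236511762),
    ("nodejs-14.7.0-1.fc33", 921485027, 1238008099),
    ("mozjs78-78.1.0-1.fc33", 623280098, 800743010),
    ("kea-1.7.9-3.fc33", 1379013840, 270963319),
    ("paraview-5.8.1-1.fc33", 11462175974, 8011695061),
    ("hpx-1.4.1-4.fc33", 10981369919, 5588742102),
]


def fdts_percent_of_dwz(fdts_size, dwz_size):
    """-fdebug-types-section size as a percentage of the DWZ size."""
    return 100.0 * fdts_size / dwz_size


ratios = {name: fdts_percent_of_dwz(f, d) for name, f, d in PACKAGES}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{ratio:7.2f}: {name}")
```

Values below 100 mean -fdebug-types-section produced the smaller *-debuginfo.rpm for that package; values above 100 mean DWZ did.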
The sizes depend strongly on the chosen subset of packages: For example for ELN-like (*) distro the saving is not 6.35% but only 0.28%. For Fedora 32 packages on my personal machines it is not 6.35% but 0.72%.
(*) I did use Fedora Rawhide subset for packages present in CentOS-8.2.
Also there is an opportunity for new non-DWZ optimization (orthogonal to DWZ/-fdebug-types-section) which can save 5.96% of *-debuginfo.rpm with clang-only draft implementation which requires no DWARF consumers modification and it is easier to implement than to upstream+maintain the DWZ support for LLDB.
I find that 3.3% number strange — it would mean that dwz is essentially useless, but maybe I'm misunderstanding how it's defined.
F-32 x86_64 has 157GB total, debug/ is 82GB (6GB is *-debugsource): 6.35% * (82-6) / 157 = 3.07% approx., the 3.3% was calculated with more exact distro size numbers.
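The approximation above can be written out as a small calculation. The sketch uses only the rounded GB figures quoted in this mail, so it lands on about 3.07% rather than the 3.3% obtained from the more exact distro sizes; the variable names are made up for this example.

```python
# Rounded figures from the mail above (GB).
total_gb = 157        # whole F-32 x86_64 distribution
debug_gb = 82         # debug/ directory
debugsource_gb = 6    # *-debugsource part, not affected by DWZ

dwz_saving = 0.0635   # DWZ saving on *-debuginfo.rpm relative to -fdebug-types-section

# Share of the whole distribution size that the DWZ saving represents.
distro_share = dwz_saving * (debug_gb - debugsource_gb) / total_gb

if __name__ == "__main__":
    print(f"DWZ saving as share of the distribution: {100 * distro_share:.2f}%")
```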
I think we need to get some better understanding what the effects of various approaches are before discussing which to pick.
Thanks for this discussion.
Jan
On Fri, 25 Sep 2020 16:29:26 +0200, Jan Kratochvil wrote:
On Fri, 25 Sep 2020 12:10:22 +0200, Zbigniew Jędrzejewski-Szmek wrote:
- debuginfo with dwz (current approach)
rpm size: 35186079102 disk size: 177913332940
- debuginfo with -fdebug-types-section
rpm size: 37570327765 disk size: 214927514757
= DWZ rpm size is smaller by 6.78% = DWZ on-disk size is smaller by 20.8%
...
Also there is an opportunity for new non-DWZ optimization (orthogonal to DWZ/-fdebug-types-section) which can save 5.96% of *-debuginfo.rpm with
^^^^^^^^^^^^^^^ = on-disk files
clang-only draft implementation which requires no DWARF consumers modification and it is easier to implement than to upstream+maintain the DWZ support for LLDB.
Here I made a mistake. One needs to compare 5.96% to the DWZ saving of 20.8% (and not 6.78%). Therefore this "non-DWZ optimization" is not equivalent to DWZ (but it could still save some space). This is also only an estimate.
I have updated the wiki page (it does not change much on the proposal).
Jan
On Fri, 25 Sep 2020 16:29:26 +0200, Jan Kratochvil wrote:
On Fri, 25 Sep 2020 12:10:22 +0200, Zbigniew Jędrzejewski-Szmek wrote:
I would love to see a comparison of numbers for three things:
- raw debuginfo without dwz or -fdebug-types-section
Oops, I do not have this number, I can run new massrebuild, it takes about 4 days (depending on availability of beefy machines).
After almost 5 days it finished: rpm size: 45643460133 disk size: 233466873738
= neither compression rpm size is bigger by 29.72%.
= neither compression on-disk size is bigger by 31.23%.
It is based on 22126 Fedora Rawhide packages (20810 rebuilt) on 2020-09-30.
That has been arithmetically approximated from:
New mass rebuild's no DWZ && no -fdebug-types-section:
rpm size: 59324710852 disk size: 299606472067
New mass rebuild's default DWZ:
rpm size: 45732816106 disk size: 228314986029
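The approximation is presumably a simple rescaling: the ratio between the uncompressed and DWZ sizes of the new mass rebuild is applied to the DWZ sizes of the 2020-08-24 rebuild. The sketch below follows that assumption (the function name is made up for this example) and reproduces the reported byte counts closely. With these inputs the rpm size comes out roughly 29.7% bigger than DWZ and the on-disk size roughly 31.2% bigger.

```python
def scale_to_old_baseline(old_dwz, new_dwz, new_plain):
    """Estimate the uncompressed size on the old package set by ratio."""
    return old_dwz * new_plain / new_dwz


# DWZ sizes from the 2020-08-24 rebuild (bytes).
old_dwz_rpm, old_dwz_disk = 35186079102, 177913332940
# New mass rebuild: default DWZ and no DWZ && no -fdebug-types-section.
new_dwz_rpm, new_dwz_disk = 45732816106, 228314986029
new_plain_rpm, new_plain_disk = 59324710852, 299606472067

est_rpm = scale_to_old_baseline(old_dwz_rpm, new_dwz_rpm, new_plain_rpm)
est_disk = scale_to_old_baseline(old_dwz_disk, new_dwz_disk, new_plain_disk)

rpm_bigger = 100.0 * (est_rpm / old_dwz_rpm - 1)
disk_bigger = 100.0 * (est_disk / old_dwz_disk - 1)

if __name__ == "__main__":
    print(f"rpm size:  {est_rpm:.0f} (+{rpm_bigger:.2f}% over DWZ)")
    print(f"disk size: {est_disk:.0f} (+{disk_bigger:.2f}% over DWZ)")
```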
- debuginfo with dwz (current approach)
rpm size: 35186079102 disk size: 177913332940
- debuginfo with -fdebug-types-section
rpm size: 37570327765 disk size: 214927514757
= DWZ rpm size is smaller by 6.78% = DWZ on-disk size is smaller by 20.8%
It is based on 22080 Fedora Rawhide packages rebuilt on 2020-08-24.
It is based on 22080 Fedora Rawhide packages (20637 rebuilt) on 2020-08-24.
Jan
== Benefit to Fedora ==
- Better compatibility with existing debugging and tracing tools,
primarily [https://lldb.llvm.org/ LLDB].
Thanks for your work on this Ben and Jan, Just as an interested user, use of the DWZ format significantly limits Swift development on Fedora, as it is impossible to debug with LLDB when using system libraries.
Ryan
On Fri, Sep 25, 2020 at 7:26 AM Ryan ryan@testtoast.com wrote:
== Benefit to Fedora ==
- Better compatibility with existing debugging and tracing tools,
primarily [https://lldb.llvm.org/ LLDB].
Thanks for your work on this Ben and Jan, Just as an interested user, use of the DWZ format significantly limits Swift development on Fedora, as it is impossible to debug with LLDB when using system libraries.
But that's fixable since there's a patchset to make LLDB understand dwz, which was not submitted upstream for unstated reasons.
On Fri, 25 Sep 2020 13:43:40 +0200, Neal Gompa wrote:
On Fri, Sep 25, 2020 at 7:26 AM Ryan ryan@testtoast.com wrote:
== Benefit to Fedora ==
- Better compatibility with existing debugging and tracing tools,
primarily [https://lldb.llvm.org/ LLDB].
Thanks for your work on this Ben and Jan, Just as an interested user, use of the DWZ format significantly limits Swift development on Fedora, as it is impossible to debug with LLDB when using system libraries.
Currently you can use my off-trunk patchset build: dnf copr enable jankratochvil/lldb;dnf install lldb-experimental;lldb-experimental
But that's fixable since there's a patchset to make LLDB understand dwz, which was not submitted upstream for unstated reasons.
The reasons are stated in other mails, re-stating here:
* the support of DWZ is complicated for effective DWARF consumers like LLDB
* the same DWZ size reduction can be achieved just by removing dead DIEs and -fdebug-types-section skeletons - not needing any DWZ nor DWZ support in consumers at all
* the LLDB DWZ support is now scattered across the LLDB DWARF codebase for every DIE type so it will require continuous maintenance during LLDB development from RH; Google+Apple will probably never use DWZ themselves.
* the LLDB DWZ support is currently implemented in LLDB DWARF/ but LLDB is going to replace its DWARF code to clang's DWARF code which will then require rewriting the LLDB DWZ code from scratch
Jan
Thanks Jan,
I had subsequently discovered your COPR, which does work with the DWZ symbols and allow debugging, however your version is missing Swift support, and so doesn't support Swift function name demangling and variable display etc.
+1 for moving to -fdebug-types-section anyway.
Regards,
Ryan
On Sat, 26 Sep 2020, at 1:23 AM, Jan Kratochvil wrote:
On Fri, 25 Sep 2020 13:43:40 +0200, Neal Gompa wrote:
On Fri, Sep 25, 2020 at 7:26 AM Ryan ryan@testtoast.com wrote:
== Benefit to Fedora ==
- Better compatibility with existing debugging and tracing tools,
primarily [https://lldb.llvm.org/ LLDB].
Thanks for your work on this Ben and Jan, Just as an interested user, use of the DWZ format significantly limits Swift development on Fedora, as it is impossible to debug with LLDB when using system libraries.
Currently you can use my off-trunk patchset build: dnf copr enable jankratochvil/lldb;dnf install lldb-experimental;lldb-experimental
But that's fixable since there's a patchset to make LLDB understand dwz, which was not submitted upstream for unstated reasons.
The reasons are stated in other mails, re-stating here:
- the support of DWZ is complicated for effective DWARF consumers like LLDB
- the same DWZ size reduction can be achieved just by removing dead DIEs and -fdebug-types-section skeletons - not needing any DWZ nor DWZ support in consumers at all
- the LLDB DWZ support is now scattered across the LLDB DWARF codebase for every DIE type so it will require continuous maintenance during LLDB development from RH; Google+Apple will probably never use DWZ themselves.
- the LLDB DWZ support is currently implemented in LLDB DWARF/ but LLDB is going to replace its DWARF code to clang's DWARF code which will then require rewriting the LLDB DWZ code from scratch
Jan
On Thu, Sep 24, 2020 at 11:59:44AM -0400, Ben Cotton wrote:
https://fedoraproject.org/wiki/Changes/DebugInfoStandardization
== Summary == Fedora 18 implemented [[Features/DwarfCompressor]]. As the format did not get widespread and the tool is not much maintained it became burden to make existing debugging tools compatible with Fedora debug info.
I'd like to state that there is no agreement about this in the toolchain team, Jan had probably problems getting his DWZ support upstreamed into LLDB and decided to spend his time instead non-constructively trying to kill DWZ.
-fdebug-types-section in GCC is pretty much unmaintained, almost nobody is really using it, and the debug types design is quite flawed and shouldn't have been added into DWARF. What DWZ implements is what DWARF3 and onwards have been documenting as DWARF compression technique, before DWARF5 for the multi-file it has been using an extension but that is now standardized in DWARF5. Mark is actively working on DWARF5 support for dwz right now, and furthermore what -fdebug-types-section does is not the same thing in principle as what DWZ does. DWZ primarily just removes redundancies, when the DIEs are the same, and doesn't really matter what kind of DIEs it is, while -fdebug-types-section is about types only, and does significantly change what the debug info contains and how it can be referenced. Furthermore, while DWZ currently does not support -fdebug-types-section, it isn't principally incompatible with it, just I didn't want to waste time on something I saw as broken by design. DWZ in theory could handle what -fdebug-types-section produces and undo the significant size overhead it adds due to the large references (basically could undo the type units and turn them back into normal .debug_info).
So, from my side, strong objection against this proposal.
Jakub
On Fri, 25 Sep 2020 13:56:47 +0200, Jakub Jelinek wrote:
Jan had probably problems getting his DWZ support upstreamed into LLDB
It isn't completely easy, I have already upstreamed a lot of preparatory work for the DWZ patchset which is good for LLDB in general.
Just before upstreaming the final DWZ-specific part I am trying to provide a better solution for Fedora as the numbers I have benchmarked do not speak for the overcomplicated DWZ solution.
and decided to spend his time instead non-constructively trying to kill DWZ.
Simplifying Fedora toolchains is more constructive than supporting complicated format with no benefits.
-fdebug-types-section in GCC is pretty much unmaintained, almost nobody is really using it,
According to Richard Biener from GCC -fdebug-types-section is a normally supported GCC feature: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88878#c6
and the debug types design is quite flawed and shouldn't have been added into DWARF.
So why is Google using it for everything? I have this opinion about DWZ.
What DWZ implements is what DWARF3 and onwards have been documenting as DWARF compression technique, before DWARF5 for the multi-file it has been using an extension but that is now standardized in DWARF5.
It is specified by the standard but without .debug_names support which is essential for effective DWARF consumers. Also not everything that got accepted into the DWARF standard is necessarily a good thing. Besides that DWZ could have better size reduction if:
* it did not contain so many bugs it fails on many files
* it did not give up on larger debuginfos due to running out of memory: it could implement slower alternative algorithms with on-disk temporary files
Mark is actively working on DWARF5 support for dwz right now,
Mark Wielaard does not plan to support DWARF-5 features being used by LLVM.
DWZ primarily just removes redundancies, when the DIEs are the same, and doesn't really matter what kind of DIEs it is, while -fdebug-types-section is about types only, and does significantly change what the debug info contains and how it can be referenced.
DWZ's only possible benefit is the DWZ common file. Without DWZ common file for reasons listed above DWZ produces 1.6% bigger debuginfo than -fdebug-types-section while requiring 8x more complicated consumers.
And for example for downloading of separate *.debug files DWZ is even bigger than -fdebug-types-section exactly because of the DWZ common files which are a double-edged sword.
Besides that it is all about few percents of size nobody cares about.
DWZ in theory could handle what -fdebug-types-section produces and undo the significant size overhead it adds due to the large references (basically could undo the type units and turn them back into normal .debug_info).
Those "large references" are already considered in the numbers I provided. If one removes the "large references" together with dead DIEs (address zero) one has the same size decrease as DWZ without any DWZ needed.
Jan
Jan Kratochvil jan.kratochvil@redhat.com writes:
So why is Google using it for everything?
If I could eliminate one bad thought pattern in software design it would probably be this one.
In brief: you are not Google, nor are you Facebook, nor Amazon. Your problems are not their problems. Your use case is not their use case. Plenty of things work great for them that will work terribly for you.
So saying "Google does it" (or similar) is *not* a good argument.
Thanks, --Robbie
* Robbie Harwood:
Jan Kratochvil jan.kratochvil@redhat.com writes:
So why is Google using it for everything?
If I could eliminate one bad thought pattern in software design it would probably be this one.
In brief: you are not Google, nor are you Facebook, nor Amazon. Your problems are not their problems. Your use case is not their use case. Plenty of things work great for them that will work terribly for you.
So saying "Google does it" (or similar) is *not* a good argument.
Agreed, especially since we know that e.g. Google's use of C++ does not align well with how many other programmers use the language.
Thanks, Florian
Hi,
On Fri, 2020-09-25 at 17:18 +0200, Florian Weimer wrote:
- Robbie Harwood:
Jan Kratochvil jan.kratochvil@redhat.com writes:
So why is Google using it for everything?
If I could eliminate one bad thought pattern in software design it would probably be this one.
In brief: you are not Google, nor are you Facebook, nor Amazon. Your problems are not their problems. Your use case is not their use case. Plenty of things work great for them that will work terribly for you.
So saying "Google does it" (or similar) is *not* a good argument.
Agreed, especially since we know that e.g. Google's use of C++ does not align well with how many other programmers use the language.
The Google engineers responsible for their internal build system don't make it a secret. They use debug-types combined with [out of .o file] split-dwarf[=split]. But they also admit that it is for a specialized use case that might only make sense if you have a central build system that farms out different parts of the build/compile/link steps to different machines.
(for Google) - a distributed build system that is trying to avoid moving more bytes than it must to one machine to run the link step. So not having to ship all the DWARF bytes to one machine for interactive debugging (pulling down from a distributed file system only the needed .dwo files during debugging - not all of them) - or at least being able to ship all the .dwo files to one machine to make a .dwp, and ship all the .o files to another machine for the link.
It is certainly a clever setup and makes sense if your build bottleneck is sending files around between different machines. But I don't think this is the generic Fedora packager or developer use case.
Cheers,
Mark
On Mon, 28 Sep 2020 14:08:48 +0200, Mark Wielaard wrote:
It is certainly a clever setup and makes sense if your build bottleneck is sending files around between different machines. But I don't think this is the generic Fedora packager or developer use case.
I agree, and I do not propose -gsplit-dwarf anywhere. It is off-topic for this mail thread, although it may look as if it is related to my -fdebug-types-section proposal.
It would possibly make sense only for Chromium, which currently has no debuginfo in Fedora at all. That debuginfo is missing because of DWZ: DWZ does not support -fdebug-types-section, nobody supports DWARF64, and the Chromium .debug_info section without -fdebug-types-section would be larger than 4GB, which is technically impossible with DWARF32.
I am saying -gsplit-dwarf is probably the best solution for it, even though right now -fdebug-types-section is the best (smallest possible file) solution. With -gdwarf-5 -fdebug-types-section its .debug_info section is already very close to 4GB and will exceed that limit in time anyway. Then only -gsplit-dwarf will be possible with DWARF32 && DWARF-5.
Jan
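The 4GB constraint mentioned in the message above follows from DWARF32 using 32-bit section offsets. A minimal sketch of that check, assuming the limit is simply 2^32 bytes and ignoring the few reserved length values; the function name and the sample sizes are hypothetical and are not measurements from this thread:

```python
# DWARF32 stores section offsets in 4 bytes, so a section can hold at most
# 2**32 addressable bytes. This is a simplification: it ignores the small
# range of reserved initial-length values.
DWARF32_OFFSET_BITS = 32
DWARF32_MAX_SECTION_SIZE = 1 << DWARF32_OFFSET_BITS  # 4 GiB


def fits_dwarf32(section_size_bytes: int) -> bool:
    """Return True if every byte of the section is reachable by a 32-bit offset."""
    return 0 <= section_size_bytes <= DWARF32_MAX_SECTION_SIZE


if __name__ == "__main__":
    # Hypothetical section sizes, for illustration only.
    for size in (3 * 1024**3, 5 * 1024**3):
        print(size, "fits in DWARF32:", fits_dwarf32(size))
```

A section that fails this check needs either DWARF64 offsets or some way of splitting or shrinking the data, which is the choice discussed in the message.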
* Jan Kratochvil:
On Fri, 25 Sep 2020 17:09:54 +0200, Robbie Harwood wrote:
So saying "Google does it" (or similar) is *not* a good argument.
So let's stick only to the numbers I sent in other mails. In fact I do not understand why we talk about anything except the numbers.
The numbers are very difficult to understand because it's not clear what you are measuring. Especially since, as far as I understand it, parts are not yet fully implemented, so we can't know yet if all the required data is there.
Thanks, Florian
On Fri, 25 Sep 2020 17:40:50 +0200, Florian Weimer wrote:
The numbers are very difficult to understand because it's not clear what you are measuring. Especially since, as far as I understand it, parts are not yet fully implemented, so we can't know yet if all the required data is there.
TL;DR nobody cares about 3% of distribution size (for Fedora; for CentOS 0%).
If someone really does, I can just post-process the DWARF files to drop dead DIEs, and the size will be the same as with DWZ without any need for DWZ.
Jan