On Thu, Oct 28, 2021 at 09:30:18PM +0200, Lennart Poettering wrote:
I was thinking more about this proposal over the past weekend and
where I keep ending up is that this is really optimizing for a small
use case by touching ELF metadata all over the system. And that
strikes me as pretty invasive, so is it worth the tradeoffs and risks
and such?
Well, whether it's small or not is subjective, I think. CBL-Mariner was the first
distro to add support for this, and I can tell you with certainty that it's not small
at all for the internal use cases where this is employed.
I mean, build-ids too are never needed if everything goes well and nothing ever crashes
or needs to be debugged. So one could say that even adding that to the ELF header is touching
everything for a small use case. But I don't think that would be fair, because even if
most of the time things are just working and you don't need to debug anything,
it's when things start to go very wrong that you want to have as much help and as many
tools at your disposal as possible.
Even if it were a small use case, support engineers and maintainers have such a hard job
already that it seems to me it would be worth making some of it easier if the cost is just a
few KB of disk space in the typical scenario.
Debugging is a pain, and anything to make that easier is better. It
has been stated multiple times that the information needs to be in the
ELF header because containers and images may lack an RPM database.
Fair, but what about the users that both want a container and image
without the RPM database and systemd-coredump? They still have all of
their ELF files with this information that they removed in other ways.
Do we provide those users with a script to strip .gnu.notes from
everything or is that even a use case of concern?
Getting the system very small for container and image use has
been a goal for a while. And sure, we're not talking about a lot of
data, but that's now. The size of everything only grows, so is that
something to consider with the implementation of this feature?
Note that systemd-coredump is on the host, not in the container. This is one of the points
of the proposal, and mentioned in the wiki: core files are collected and analyzed on the
host, not in container guests, so one of the reasons an rpm/dpkg/whatever database
won't help is because they don't necessarily match. So users don't have to
include systemd-coredump in their container images at all, before or after this proposal.
Moreover, the difference in size in a typical small container between an rpm/dpkg/whatever
database and the overhead of these notes is several orders of magnitude - several
megabytes vs several kilobytes. Again, this comparison is done on the wiki:
https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects...
Finally, if one is really convinced that they don't need this and wants to strip it
out, there's no need for a script: objcopy supports removing notes out of
the box with objcopy --remove-section.
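
For illustration, here is a minimal sketch of how that removal could be wrapped for
a batch of binaries. It assumes the note lives in a section named .note.package
(the name is an assumption here, it is not stated in this mail) and that GNU objcopy
is on the PATH. The helper names are hypothetical.

```python
import shutil
import subprocess

# Assumed section name for the package metadata note; adjust if yours differs.
DEFAULT_SECTION = ".note.package"


def remove_note_command(binary, section=DEFAULT_SECTION):
    """Build the objcopy command line that drops one section from a binary."""
    return ["objcopy", f"--remove-section={section}", binary]


def strip_package_note(binary, section=DEFAULT_SECTION):
    """Run objcopy on the binary; with a single file argument it rewrites that file."""
    if shutil.which("objcopy") is None:
        raise RuntimeError("objcopy not found in PATH")
    subprocess.run(remove_note_command(binary, section), check=True)
```

Calling strip_package_note("/path/to/binary") would then modify that one file; try it
on a copy first.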
Another thing I thought about was reproducible builds. Does this
impact reproducible builds and if so, how do we handle that?
It does not impact that; this is covered on the wiki:
https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects...
I would feel more comfortable with this proposal if the data for
systemd-coredump was not part of the ELF metadata. Or if it
absolutely must be part of the ELF metadata, users should know how it
can be removed. I would also vote for a format other than JSON, but
that's just me.
At MSFT we tried several options, but as far as we could tell the only way to be 100%
certain that the metadata is always included automatically in the corefile if present in
the binary is to use an allocated ELF PT_NOTE. Note that systemd-coredump is just one of
the possible consumers - again, at MSFT we have other internal tooling consuming it as
well.
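
To give an idea of what a consumer has to do, here is a rough sketch of walking
ELF note records and decoding a JSON payload. It uses the generic ELF note layout
(namesz, descsz, type, then name and descriptor each padded to 4 bytes) and assumes
little-endian data. The owner string "FDO" and the type value 0xcafe1a7e are
assumptions taken from my reading of the package metadata spec and are not stated in
this mail. The sample package values are made up, and build_note exists only to
create test input.

```python
import json
import struct

# Assumed identifiers for the package metadata note (not confirmed by this thread).
PACKAGE_NOTE_OWNER = b"FDO"
PACKAGE_NOTE_TYPE = 0xCAFE1A7E


def _align4(n):
    """Round n up to the next multiple of 4, as ELF notes pad name and desc."""
    return (n + 3) & ~3


def build_note(owner, note_type, desc):
    """Serialize one little-endian ELF note record (helper for the example only)."""
    name = owner + b"\0"
    header = struct.pack("<III", len(name), len(desc), note_type)
    return (header
            + name.ljust(_align4(len(name)), b"\0")
            + desc.ljust(_align4(len(desc)), b"\0"))


def parse_notes(data):
    """Yield (owner, type, desc) for each note record in a raw note segment."""
    offset = 0
    while offset + 12 <= len(data):
        namesz, descsz, note_type = struct.unpack_from("<III", data, offset)
        offset += 12
        name = data[offset:offset + namesz].rstrip(b"\0")
        offset += _align4(namesz)
        desc = data[offset:offset + descsz]
        offset += _align4(descsz)
        yield name, note_type, desc


def package_metadata(data):
    """Return the decoded JSON of the first matching package note, or None."""
    for name, note_type, desc in parse_notes(data):
        if name == PACKAGE_NOTE_OWNER and note_type == PACKAGE_NOTE_TYPE:
            return json.loads(desc.rstrip(b"\0").decode("utf-8"))
    return None


# Made-up sample: an unrelated note followed by a package note.
sample_payload = json.dumps(
    {"type": "rpm", "name": "example", "version": "1.0"}
).encode("utf-8") + b"\0"
sample_segment = (build_note(b"GNU", 3, b"\x01\x02\x03\x04")
                  + build_note(PACKAGE_NOTE_OWNER, PACKAGE_NOTE_TYPE, sample_payload))
```

Because the note is allocated, the same bytes end up in the core file, so a tool on
the host can run this kind of parser without any package database from the guest.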