Hi,
As discussed a few weeks ago, I'm working on reproducible builds for Fedora. I've submitted a review request for new packages: https://bugzilla.redhat.com/show_bug.cgi?id=1924918. Notably, reprotest is a striking tool that tests reproducibility by varying multiple build factors (time, user, lang, etc.) and highlights the differences (if any) with diffoscope (see https://salsa.debian.org/reproducible-builds/reprotest).
On the same topic, I'm developing rpmreproduce (see https://github.com/fepitre/rpmreproduce), which is very much a work in progress. This tool rebuilds an RPM in the same environment, with the same package versions, etc. It continues a previous attempt, https://github.com/kholia/ReproducibleBuilds. Currently, it uses a "buildinfo" file as input (see https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles), but there is no such file in Fedora (yet?). In Qubes OS, we use an original implementation for RPM written during a Reproducible Builds summit: https://github.com/QubesOS/qubes-builder-rpm/blob/master/scripts/rpmbuildinf... or https://raw.githubusercontent.com/fepitre/rpmreproduce/master/scripts/rpmbui... (latest dev/test version). This tool downloads the exact dependency versions specified in the buildinfo, creates a local repository, downloads the corresponding source RPM, and then rebuilds it with mock using only this local repository, which reflects the original build environment.
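The rebuild steps described above can be pictured with a minimal Python sketch. It is not rpmreproduce's actual code: the function names, the repository id and the example paths are hypothetical, and the mock configuration passed with -r is assumed to list only the local repository. The sketch only assembles the repo stanza and the command lines; it does not run anything.

```python
from pathlib import PurePosixPath
from typing import List


def local_repo_stanza(repo_dir: str, repo_id: str = "buildinfo-local") -> str:
    """Return a dnf-style repo stanza pointing at a local directory.

    repo_dir must be an absolute POSIX path. The repo id is a made-up name.
    """
    base_url = PurePosixPath(repo_dir).as_uri()  # e.g. file:///srv/repro/repo
    lines = [
        f"[{repo_id}]",
        "name=Local repository rebuilt from a buildinfo file",
        f"baseurl={base_url}",
        "enabled=1",
        # Assumption: signatures were already verified at download time.
        "gpgcheck=0",
    ]
    return "\n".join(lines) + "\n"


def plan_rebuild(repo_dir: str, srpm: str, mock_cfg: str, result_dir: str) -> List[List[str]]:
    """Return the command lines for the rebuild, in the order they would run.

    Step 1 indexes the downloaded dependencies as a repository.
    Step 2 rebuilds the source RPM with mock, using a config file that is
    assumed to reference only that local repository.
    """
    return [
        ["createrepo_c", repo_dir],
        ["mock", "-r", mock_cfg, f"--resultdir={result_dir}", "--rebuild", srpm],
    ]
```

For example, plan_rebuild("/srv/repro/repo", "foobar-1.0-1.fc35.src.rpm", "/srv/repro/local.cfg", "/srv/repro/result") returns the createrepo_c and mock command lines; downloading the dependencies and the source RPM would happen before these two steps.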
I'd like to take this opportunity to invite RPM devs to discuss a possible upstream implementation of the buildinfo file format. For example, rpmbuild could generate a buildinfo file automatically, as dpkg does in Debian. I would be happy to do the work for that.
Best regards, Frédéric
On Wed, Feb 03, 2021 at 10:50:43PM +0100, Frédéric Pierret wrote:
Hi,
As discussed a few weeks ago, I'm working on reproducible builds for Fedora.
...snip...
I'll try and take a look at the tools mentioned when I get a chance, but I wanted to just thank you for working on this. :)
So, thanks!
kevin
On Wed, Feb 3, 2021 at 4:51 PM Frédéric Pierret frederic.pierret@qubes-os.org wrote:
Hi,
As discussed a few weeks ago, I'm working on reproducible builds for Fedora.
...snip...
The Koji build system already records buildinfo data in a slightly different form, where build IDs are linked to all the inputs that constructed the build environment as recorded by Koji itself. This implicitly includes a definition of all the RPM macros that are inputs for a build, too.
I would generally expect that information like this at the rpmbuild level should probably be stored in the Source RPM. Since a Source RPM is an atomic unit containing a build description and the inputs needed to make the build work, it would make sense that more comprehensive build environment data would go there. Source RPMs already contain some rudimentary stuff, like the compiler build flags set in the build environment during the package build time. It would make sense to expand this to cover all inputs traditionally covered by the buildinfo file in Debian.
On Wed, Feb 03, 2021 at 10:41:30PM -0500, Neal Gompa wrote:
The Koji build system already records buildinfo data in a slightly different form, where build IDs are linked to all the inputs that constructed the build environment as recorded by Koji itself. This implicitly includes a definition of all the RPM macros that are inputs for a build, too.
I would generally expect that information like this at the rpmbuild level should probably be stored in the Source RPM. Since a Source RPM is an atomic unit containing a build description and the inputs needed to make the build work, it would make sense that more comprehensive build environment data would go there. Source RPMs already contain some rudimentary stuff, like the compiler build flags set in the build environment during the package build time. It would make sense to expand this to cover all inputs traditionally covered by the buildinfo file in Debian.
Isn't it going to be an issue that the initial SRPM can't possibly have this info? Buildinfo is generally a product of a build, similar to a build log but more structured, and in some cases it can also be used as a build input (for reproducing a particular binary RPM). Copying from the Debian wiki:
The .buildinfo file has several goals which are related to each other:
* It records information about the system environment used during a particular build -- packages installed (toolchain, etc), system architecture, etc. This can be useful for forensics/debugging.
* It can also be used to try to recreate (partially or in full) the system environment when trying to reproduce a particular build.
It makes perfect sense to take a single SRPM and build it in two different environments (for example for f33 and f34), resulting in two different sets of binary RPMs and two buildinfo files (wherever they end up).
So, if it is included in an RPM, I'd say it is more logical to include it in a binary RPM, which is a build output. In fact, Arch Linux does exactly that (in their package format). If it were in an SRPM, you'd need to rebuild/modify the SRPM _after_ building the binary RPMs, which feels wrong...
Does it make sense?
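To make the Debian wiki excerpt above more concrete, here is a minimal Python sketch that reads a Debian-style .buildinfo record (a "Field: value" layout with indented continuation lines). The field names follow Debian's format, but the sample values are invented, the parser is deliberately simplified, and nothing here is an agreed RPM format.

```python
from typing import Dict, List

# Invented sample data, laid out like a Debian .buildinfo file.
SAMPLE = """\
Format: 1.0
Source: foobar
Version: 1.0-1
Architecture: x86_64
Installed-Build-Depends:
 gcc (= 11.0.0-1),
 make (= 4.3-5)
"""


def parse_buildinfo(text: str) -> Dict[str, str]:
    """Parse 'Field: value' lines; indented lines continue the previous field."""
    fields: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t":
            if current is None:
                raise ValueError("continuation line before any field")
            fields[current] += "\n" + line.strip()
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"not a field line: {line!r}")
            current = key.strip()
            fields[current] = value.strip()
    return fields


def installed_build_depends(fields: Dict[str, str]) -> List[str]:
    """Return the recorded build environment as one entry per package."""
    raw = fields.get("Installed-Build-Depends", "")
    return [entry.strip().rstrip(",") for entry in raw.splitlines() if entry.strip()]
```

Calling installed_build_depends(parse_buildinfo(SAMPLE)) yields one pinned package per entry, which is the information a rebuild tool would need to recreate the environment. Building the same SRPM for f33 and f34 would give two such records with different entries.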
On Fri, Feb 05, 2021 at 12:17:28AM +0100, Marek Marczykowski-Górecki wrote:
Does it make sense?
That does make sense to me... and perhaps this fits in with that we generate debuginfo/debugsource rpms when we build something. We just expand things to also produce a buildinfo subpackage (of course then we need tools to gather them/put them in a repo/allow users to install them, etc).
Then, you could 'dnf install foobar-buildinfo-1.0-1.fc35' and possibly there could be tools that would read that .buildinfo and feed the src.rpm into mock or whatever with that input?
kevin
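As a small illustration of the naming in the dnf example above, this Python sketch derives a -buildinfo package name from a name-version-release string. It assumes a buildinfo subpackage would follow the same pattern as -debuginfo packages, which has not been decided; the function name is hypothetical.

```python
def buildinfo_package(nvr: str) -> str:
    """Turn 'name-version-release' into 'name-buildinfo-version-release'.

    The version and release never contain '-', so splitting from the right
    keeps hyphens that belong to the package name.
    """
    name, version, release = nvr.rsplit("-", 2)
    return f"{name}-buildinfo-{version}-{release}"
```

With this, the binary package foobar-1.0-1.fc35 maps to the name used in the dnf command above.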
On Thu, Feb 4, 2021 at 9:23 PM Kevin Fenzi kevin@scrye.com wrote:
That does make sense to me... and perhaps this fits in with that we generate debuginfo/debugsource rpms when we build something. We just expand things to also produce a buildinfo subpackage (of course then we need tools to gather them/put them in a repo/allow users to install them, etc).
Then, you could 'dnf install foobar-buildinfo-1.0-1.fc35' and possibly there could be tools that would read that .buildinfo and feed the src.rpm into mock or whatever with that input?
That is, of course, another valid strategy, and may make sense given the archful nature of some things.
On Thu, Feb 04, 2021 at 10:56:43PM -0500, Neal Gompa wrote:
On Thu, Feb 4, 2021 at 9:23 PM Kevin Fenzi kevin@scrye.com wrote:
That does make sense to me... and perhaps this fits in with that we generate debuginfo/debugsource rpms when we build something. We just expand things to also produce a buildinfo subpackage (of course then we need tools to gather them/put them in a repo/allow users to install them, etc).
Can you expand on that last part? Are you referring to some automation that pulls debuginfo rpms into a separate repository? I guess the buildinfo one should go into sources repo, right?
Then, you could 'dnf install foobar-buildinfo-1.0-1.fc35' and possibly there could be tools that would read that .buildinfo and feed the src.rpm into mock or whatever with that input?
That is, of course, another valid strategy, and may make sense given the archful nature of some things.
New sub-package sounds like a good idea. Is it ok to package it as a single file in /usr/src/buildinfo?
On Fri, Feb 5, 2021 at 7:03 AM Marek Marczykowski-Górecki marmarek@invisiblethingslab.com wrote:
Can you expand on that last part? Are you referring to some automation that pulls debuginfo rpms into a separate repository? I guess the buildinfo one should go into sources repo, right?
New sub-package sounds like a good idea. Is it ok to package it as a single file in /usr/src/buildinfo?
Sure, we'd probably have buildinfo packages in a separate repository like we do debuginfo packages. Most people will never use them.
The advantage of -buildinfo packages is that we can start with a buildinfo file and actually add more as we want to have more complete build records, including macros, environment variables, etc.
-- 真実はいつも一つ!/ Always, there's only one truth!
On Fri, Feb 05, 2021 at 08:10:28AM -0500, Neal Gompa wrote:
Sure, we'd probably have buildinfo packages in a separate repository like we do debuginfo packages. Most people will never use them.
The advantage of -buildinfo packages is that we can start with a buildinfo file and actually add more as we want to have more complete build records, including macros, environment variables, etc.
Is it really worth creating a package for this? The payload could just as well be implemented as a json file. Wrapping this in a separate subpackage seems like unnecessary overhead. If we want to expose it in a way that allows installing using dnf, maybe we could just stash it in the debuginfo package? debuginfo packages are rather big already, so the extra file wouldn't really matter much.
Zbyszek
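The JSON idea above could look roughly like the following Python sketch. The field names (name, installed, macros, environment) are hypothetical placeholders rather than a proposed schema. Keys and the package list are sorted so that the same build record always serializes to the same bytes.

```python
import json


def render_buildinfo(record: dict) -> str:
    """Serialize a build record as deterministic JSON text.

    Sorting the keys and the 'installed' list means two records with the
    same content produce identical output, whatever order they were built in.
    """
    normalized = dict(record)
    normalized["installed"] = sorted(record.get("installed", []))
    return json.dumps(normalized, indent=2, sort_keys=True) + "\n"


# Invented example record; every value here is made up for illustration.
EXAMPLE = {
    "name": "foobar",
    "version": "1.0",
    "release": "1.fc35",
    "arch": "x86_64",
    "installed": ["make-4.3-5.fc35", "gcc-11.0.0-1.fc35"],
    "macros": {"_smp_mflags": "-j4"},
    "environment": {"LANG": "C.UTF-8"},
}
```

A record like this can grow over time (more macros, more environment variables) without changing how it is packaged or shipped.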
On Fri, Feb 05, 2021 at 01:47:00PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
Is it really worth creating a package for this? The payload could just as well be implemented as a json file. Wrapping this in a separate subpackage seems like unnecessary overhead. If we want to expose it in a way that allows installing using dnf, maybe we could just stash it in the debuginfo package? debuginfo packages are rather big already, so the extra file wouldn't really matter much.
I was thinking the same thing. We don't have a lot of resources for new infrastructure projects and debuginfo already exists and seems closely-related-enough.
On Fri, Feb 05, 2021 at 01:47:00PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
Is it really worth creating a package for this? The payload could just as well be implemented as a json file. Wrapping this in a separate subpackage seems like unnecessary overhead. If we want to expose it in a way that allows installing using dnf, maybe we could just stash it in the debuginfo package? debuginfo packages are rather big already, so the extra file wouldn't really matter much.
The size of debuginfo packages depends on your POV.
If you're already pulling in debuginfo packages, then the buildinfo won't make them bigger so it is not a concern on that side.
If you're trying to consume the buildinfo data though, it feels pretty suboptimal to have to pull in the enormous debuginfo RPM just to get access to a tiny piece of build info data.
Regards, Daniel
On Fri, Feb 05, 2021 at 02:21:32PM +0000, Daniel P. Berrangé wrote:
If you're trying to consume the buildinfo data though, it feels pretty suboptimal to have to pull in the enormous debuginfo RPM just to get access to a tiny piece of build info data.
"Suboptimal" is a great word here. Because: is this a workflow hotspot for the distribution in general? I don't think so. It's a valuable special case, but one 99.999% of users won't use directly, and the few people who do won't be doing it all the time. It's a perfectly good situation to _not_ optimize.
On 2/5/21 at 3:30 PM, Matthew Miller wrote:
"Suboptimal" is a great word here. Because: is this a workflow hotspot for the distribution in general? I don't think so. It's a valuable special case, but one 99.999% of users won't use directly, and the few people who do won't be doing it all the time. It's a perfectly good situation to _not_ optimize.
That was my original remark: pulling a "big" RPM for just one file is not optimal. But as you said, it will probably concern very few users. So inserting a buildinfo file into the debuginfo RPM is a good start. Also, nothing is set in stone: if needed in the future, we could change that.
Given the discussion so far, should I assume that we have converged on a first approach of storing the buildinfo file in the debuginfo RPM?
Thank you all for your time on this subject.
Best, Frédéric
On Fri, Feb 05, 2021 at 03:55:33PM +0100, Frédéric Pierret wrote:
That was my original remark: pulling a "big" RPM for just one file is not optimal. But as you said, it will probably concern very few users. So inserting a buildinfo file into the debuginfo RPM is a good start. Also, nothing is set in stone: if needed in the future, we could change that.
Yeah, exactly — we can optimize later.
On Fri, Feb 05, 2021 at 10:07:33AM -0500, Matthew Miller wrote:
Yeah, exactly — we can optimize later.
Sounds like a nice idea, but we only make debuginfo packages for packages that have debug symbols to strip out. Where would we put it for all the packages that don't have debuginfo now? Just make one? That could be a bit confusing. (Why does this noarch script have a debuginfo file?)
I think it might be better to do a new buildinfo subpackage, but just never distribute it (for now it just exists in koji/local builds). Then, once it someday becomes of use we could start shipping it somehow.
But all this is getting a bit ahead. Someone needs to come up with the contents and tools to make/read/do cool things with them first. :)
kevin
On Fri, Feb 05, 2021 at 11:18:40AM -0800, Kevin Fenzi wrote:
Sounds like a nice idea, but we only make debuginfo packages for packages that have debug symbols to strip out. Where would we put it for all the packages that don't have debuginfo now? Just make one? That could be a bit confusing. (Why does this noarch script have a debuginfo file?)
Ah, good point.
I think it might be better to do a new buildinfo subpackage, but just never distribute it (for now it just exists in koji/local builds). Then, once it someday becomes of use we could start shipping it somehow.
I think the -buildinfo (would that name be ok?) subpackage can be included in the -source repository (if that's compatible with different packages for different archs). Alternatively, -debuginfo repo, but that feels weird.
But all this is getting a bit ahead. Someone needs to come up with the contents and tools to make/read/do cool things with them first. :)
There is one in progress already: https://github.com/fepitre/rpmreproduce
:)
On Wed, Feb 3, 2021 at 8:42 PM Neal Gompa ngompa13@gmail.com wrote:
The Koji build system already records buildinfo data in a slightly different form, where build IDs are linked to all the inputs that constructed the build environment as recorded by Koji itself. This implicitly includes a definition of all the RPM macros that are inputs for a build, too.
There is a presentation about Koji and the reproducible-builds.org effort at https://www.youtube.com/watch?v=wxzGdX5iMgw