I want to be sure, for license compliance, that all the binary bits on the final LiveCD have corresponding source code available.
One of the "features" I'd like to see something in the stack of livecd-tools produce is a CD/DVD/whatever of the SRPMS that match the RPMs that go into the LiveCD. Smooge and I have both done this ourselves, with varying degrees of ease, essentially querying all the installed RPMs on the LiveCD after-the-fact and generating the list, then grabbing the files etc. All very manual. I expect there's a better way, and I'm even open to helping code it, but am looking for direction from you - those who know the tools best...
Maybe it's really an Anaconda feature?
Advise please.
Thanks, Matt
Matt Domsch wrote:
> I want to be sure, for license compliance, that all the binary bits on the final LiveCD have corresponding source code available.
> One of the "features" I'd like to see something in the stack of livecd-tools produce is a CD/DVD/whatever of the SRPMS that match the RPMs that go into the LiveCD. Smooge and I have both done this ourselves, with varying degrees of ease, essentially querying all the installed RPMs on the LiveCD after-the-fact and generating the list, then grabbing the files etc. All very manual. I expect there's a better way, and I'm even open to helping code it, but am looking for direction from you - those who know the tools best...
A more 'during the fact' approach would be to do an rpm -qa at the end of your %post. Then maybe still in the %post, iterate over that list with rpm -qi, looking at the src rpm entry.
I notice yum has disabled by default source repos. I can't immediately see how to use them, but perhaps you could then further take the list above, and still in the %post do something to query the source repos via yum, perhaps temporarily downloading the src rpm, generating sha1sum, such that the resulting livecd includes a list of sha1sums for every src rpm. Then it would be easy enough to after the fact, extract that list from the livecd, and pull a collection of the src rpms.
Just some thoughts... Perhaps somebody can tell me why the source repos are there in the default yum config, and what can actually be done with them via yum.
-dmc
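A minimal sketch of the "during the fact" idea above, written in Python rather than as a %post shell fragment. It asks rpm for the SOURCERPM tag of every installed package and reduces the answer to a unique, sorted list. The `--queryformat` option and the `%{SOURCERPM}` tag are standard rpm features; the function names and the `root` parameter are illustrative assumptions, not part of livecd-tools.

```python
import subprocess


def parse_source_rpms(query_output):
    """Return sorted, de-duplicated SRPM file names from rpm query output.

    Expects one SOURCERPM value per line. Entries without a source rpm
    (rpm prints "(none)" for those) are skipped.
    """
    names = set()
    for line in query_output.splitlines():
        name = line.strip()
        if name and name != "(none)":
            names.add(name)
    return sorted(names)


def installed_source_rpms(root="/"):
    """Query the rpm database under `root`; needs the rpm binary on PATH."""
    result = subprocess.run(
        ["rpm", "--root", root, "-qa", "--queryformat", "%{SOURCERPM}\n"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_source_rpms(result.stdout)
```

Several binary packages usually share one source rpm, which is why the list is de-duplicated before anything is fetched.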
Douglas McClendon wrote:
Ok, so I found yum-utils. So you could iterate over your list of src rpms above, with yumdownloader --source, and yum-builddep to really make sure that you get all the source needed to rebuild the livecd.
Of course the true end-all, would be to then create an alternate version of your livecd, which included all of those things (and presumably some list of binary rpms needed for bootstrapping), as well as livecd-tools, in a /rebuild directory, such that this alternate live-bluray was completely capable of self-hosted-recompilation/production from included source (preferably from a simple script/program executable from the booted live-bluray, that requires no user interaction if it can find a suitably large tmpdir, and also no root privileges, if my aforementioned qrr program exists and is fully integrated)
That seems to me to be the most elegant way to distribute a gpl distribution :)
-dmc
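A rough sketch of the two steps discussed in this sub-thread: fetching source rpms with `yumdownloader --source` (from yum-utils) and recording a sha1sum for each one so the list can ship on the livecd. The `--source` and `--destdir` options are real yumdownloader options; the helper names and the manifest format (modelled on sha1sum output) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def yumdownloader_command(package_names, destdir):
    """Build the argv for downloading the source rpms of the given packages."""
    return ["yumdownloader", "--source", "--destdir", str(destdir), *package_names]


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large SRPMs are not read into memory at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def srpm_manifest(directory):
    """Return sha1sum-style lines for every *.src.rpm file in `directory`."""
    return [
        f"{sha1_of_file(path)}  {path.name}"
        for path in sorted(Path(directory).glob("*.src.rpm"))
    ]
```

The command list would be handed to something like `subprocess.run(...)` inside the build environment; it is only constructed here, not executed.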
On Mon, 2007-08-13 at 22:11 -0500, Matt Domsch wrote:
> I want to be sure, for license compliance, that all the binary bits on the final LiveCD have corresponding source code available.
> One of the "features" I'd like to see something in the stack of livecd-tools produce is a CD/DVD/whatever of the SRPMS that match the RPMs that go into the LiveCD. Smooge and I have both done this ourselves, with varying degrees of ease, essentially querying all the installed RPMs on the LiveCD after-the-fact and generating the list, then grabbing the files etc. All very manual. I expect there's a better way, and I'm even open to helping code it, but am looking for direction from you - those who know the tools best...
We can just use the same concept that pungi uses. The revisor team is actually working on this and when we have progress we'll let the list know. If anyone wants to just get it done, please do.
    def getSRPMList(self):
        """Cycle through the list of package objects and find the sourcerpm for them.
        Requires yum still configured and a list of package objects"""

        for po in self.polist:
            srpm = po.sourcerpm.split('.src.rpm')[0]
            if not srpm in self.srpmlist:
                self.srpmlist.append(srpm)

Start reading at line 327 of /usr/lib/python2.5/site-packages/pypungi/gather.py:

    def downloadSRPMs(self):
        """Cycle through the list of srpms and find the package objects for them,
        Then download them."""
        [...]
Jonathan Steffan
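For readers without pungi at hand, the de-duplication step in getSRPMList can be tried standalone. This is a sketch, not pungi's code: the `Package` tuple is a hypothetical stand-in for yum's package objects, of which only the `sourcerpm` attribute (used in the quoted snippet) is modelled.

```python
from collections import namedtuple

# Hypothetical stand-in for a yum package object.
Package = namedtuple("Package", ["name", "sourcerpm"])


def srpm_names(packages):
    """Collect unique source package names (name-version-release), first-seen order."""
    seen = []
    for po in packages:
        # "glibc-2.6-4.src.rpm" -> "glibc-2.6-4"
        srpm = po.sourcerpm.split(".src.rpm")[0]
        if srpm not in seen:
            seen.append(srpm)
    return seen


packages = [
    Package("glibc", "glibc-2.6-4.src.rpm"),
    Package("glibc-devel", "glibc-2.6-4.src.rpm"),
    Package("bash", "bash-3.2-9.fc7.src.rpm"),
]
print(srpm_names(packages))
```

Both glibc subpackages collapse to a single source entry, which is the point of the membership check.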
On Mon, 2007-08-13 at 22:11 -0500, Matt Domsch wrote:
> I want to be sure, for license compliance, that all the binary bits on the final LiveCD have corresponding source code available.
> One of the "features" I'd like to see something in the stack of livecd-tools produce is a CD/DVD/whatever of the SRPMS that match the RPMs that go into the LiveCD. Smooge and I have both done this ourselves, with varying degrees of ease, essentially querying all the installed RPMs on the LiveCD after-the-fact and generating the list, then grabbing the files etc. All very manual. I expect there's a better way, and I'm even open to helping code it, but am looking for direction from you - those who know the tools best...
The big problem is going to be that when we point to a repo, there's no guarantee that there's source in that repo. In fact, with the way we do repos, there isn't. And given that, doing this just within the confines of livecd-creator is going to be tricky.
For the general Fedora case, it's a bit of a moot point (images are created pointing at the pungi'd everything repo which then has its source put up). For more general creation, though, I can see where it could be desired, I just don't see a good general path to get there. :-/
Jeremy
On Tue, 14 Aug 2007 10:34:20 -0400 Jeremy Katz katzj@redhat.com wrote:
> The big problem is going to be that when we point to a repo, there's no guarantee that there's source in that repo. In fact, with the way we do repos, there isn't. And given that, doing this just within the confines of livecd-creator is going to be tricky.
Well you'd have to add a second repo to your configuration. Not only that, but you have to "reset" the yum object so that it will actually "see" the source rpms when it comes time to search for them and download them from your package sacks. In pungi I actually just create a second yum object to handle this instead of trying to reset the original object.
Jesse Keating wrote:
> On Tue, 14 Aug 2007 10:34:20 -0400 Jeremy Katz katzj@redhat.com wrote:
> > The big problem is going to be that when we point to a repo, there's no guarantee that there's source in that repo. In fact, with the way we do repos, there isn't. And given that, doing this just within the confines of livecd-creator is going to be tricky.
> Well you'd have to add a second repo to your configuration. Not only that, but you have to "reset" the yum object so that it will actually "see" the source rpms when it comes time to search for them and download them from your package sacks. In pungi I actually just create a second yum object to handle this instead of trying to reset the original object.
You could just use the existing yum object; and not reset it.
http://git.fedoraproject.org/?p=hosted/revisor;a=blob;f=revisor/base.py#l569
http://git.fedoraproject.org/?p=hosted/revisor;a=blob;f=revisor/base.py#l589
Kind regards,
Jeroen van Meeuwen -kanarip
On Wed, 15 Aug 2007 16:49:20 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> You could just use the existing yum object; and not reset it.
> http://git.fedoraproject.org/?p=hosted/revisor;a=blob;f=revisor/base.py#l569
> http://git.fedoraproject.org/?p=hosted/revisor;a=blob;f=revisor/base.py#l589
Interesting. I'll play with this in Pungi.
On Wed, 15 Aug 2007 11:13:12 -0400 Jesse Keating jkeating@redhat.com wrote:
> > You could just use the existing yum object; and not reset it.
> > http://git.fedoraproject.org/?p=hosted/revisor;a=blob;f=revisor/base.py#l569
> > http://git.fedoraproject.org/?p=hosted/revisor;a=blob;f=revisor/base.py#l589
> Interesting. I'll play with this in Pungi.
When is the last time you tried this on rawhide? I (and seth) think there are some issues with even the basics of 'resetting' the yum object (which you do with 'self.cfg.yumobj._pkgSack = None'). Further calls to _getSacks return nothing.
Perhaps the reason yours works and mine doesn't is whether or not the source repo was enabled initially. I'm going to play with that some.
On Fri, 17 Aug 2007 09:51:51 -0400 Jesse Keating jkeating@redhat.com wrote:
> Perhaps the reason yours works and mine doesn't is whether or not the source repo was enabled initially. I'm going to play with that some.
Yes, that's it. Not a huge deal; I can work with this.
Jesse Keating wrote:
> On Fri, 17 Aug 2007 09:51:51 -0400 Jesse Keating jkeating@redhat.com wrote:
> > Perhaps the reason yours works and mine doesn't is whether or not the source repo was enabled initially. I'm going to play with that some.
> Yes, that's it. Not a huge deal; I can work with this.
Right so what I'm reading is that the code needs to forcibly disable (not enable) source repositories initially then only enable them in enable_source_repositories() or none of this works?
That's a ticket ;-) [1]
Kind regards,
Jeroen van Meeuwen -kanarip
[1] https://hosted.fedoraproject.org/projects/revisor/ticket/266
On Sun, 19 Aug 2007 00:58:21 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> Right so what I'm reading is that the code needs to forcibly disable (not enable) source repositories initially then only enable them in enable_source_repositories() or none of this works?
> That's a ticket ;-) [1]
Yeah, sounds about right. The trick is going to be figuring out what repos are source repos and which aren't, since they can be named anything (and keeping in mind that currently pykickstart just takes repo as an argument, it doesn't have the concept of a source-repo yet. Maybe an RFE?).
Jesse Keating wrote:
> On Sun, 19 Aug 2007 00:58:21 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> > Right so what I'm reading is that the code needs to forcibly disable (not enable) source repositories initially then only enable them in enable_source_repositories() or none of this works?
> > That's a ticket ;-) [1]
> Yeah, sounds about right. The trick is going to be figuring out what repos are source repos and which aren't, since they can be named anything (and keeping in mind that currently pykickstart just takes repo as an argument, it doesn't have the concept of a source-repo yet. Maybe an RFE?).
One of the things Revisor assumes when forcibly enabling source repositories is that we can just append "-source" to the id of whatever other (non-source) repository is enabled.
It somewhat makes sense to require appending -source to a repository's name to indicate it's that repository's source equivalent, doesn't it? Technically though, one would want to be able to distinguish a source repo from a non-source repo since, like you said, they could be named anything.
This makes me wonder if we /really/ need some directive that says the repo is a source repo, or whether we could just figure it out on our own and do something intelligent with it. Like, we could initially only _getSacks for the RPM compat arch list (minus src). Then when we want source, _getSacks for the src arch. Maybe ;-)
This however would require that the user enables the source repositories initially if he/she/it wants the sources to be pulled in, and we neglect that we could be better off by appending -source to indicate it's a source repo equivalent...
I'm not sure what is the best way to go here: specify 'source' somewhere manually (which you do by appending -source to the repo's id anyway), or detect it automatically (in which case you would want -source appended but you don't really need it).
Any thoughts on this?
Kind regards,
Jeroen van Meeuwen -kanarip
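The "-source" naming convention described in this message can be sketched in a few lines. This is not Revisor's code; the function names are hypothetical, and the only assumption carried over from the message is that a source repo's id is the binary repo's id with "-source" appended.

```python
SOURCE_SUFFIX = "-source"


def source_repo_id(repo_id):
    """Return the id of the source counterpart under the naming convention."""
    if repo_id.endswith(SOURCE_SUFFIX):
        return repo_id
    return repo_id + SOURCE_SUFFIX


def split_repos(repo_ids):
    """Split repo ids into (binary, source) purely by their suffix."""
    binary = [r for r in repo_ids if not r.endswith(SOURCE_SUFFIX)]
    source = [r for r in repo_ids if r.endswith(SOURCE_SUFFIX)]
    return binary, source


def missing_source_repos(repo_ids):
    """List the source repo ids that the convention expects but that are absent."""
    binary, source = split_repos(repo_ids)
    return [source_repo_id(r) for r in binary if source_repo_id(r) not in source]
```

As the message itself points out, this only works when everyone follows the convention; a source repo named anything else is silently treated as a binary repo.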
On Sun, 19 Aug 2007 13:11:46 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> This makes me wonder if we /really/ need some directive that says the repo is a source repo, or whether we could just figure it out on our own and do something intelligent with it. Like, we could initially only _getSacks for the RPM compat arch list (minus src). Then when we want source, _getSacks for the src arch. Maybe ;-)
This was my original plan, except I ran afoul of the inability to "reset" the yum object to be able to get the src arch sacks out of it.
> This however would require that the user enables the source repositories initially if he/she/it wants the sources to be pulled in, and we neglect that we could be better off by appending -source to indicate it's a source repo equivalent...
Well, in pungi at least, we just enable all repos forcefully. I imagine the same goes for the livecd-tools, and whatever else moves over to kickstart config systems. Any repo listed via 'repo' would automatically get enabled.
> I'm not sure what is the best way to go here: specify 'source' somewhere manually (which you do by appending -source to the repo's id anyway), or detect it automatically (in which case you would want -source appended but you don't really need it).
> Any thoughts on this?
I'd like to talk to Chris Lumens and Jeremy Katz about the idea of a --source flag to the repo directive in pykickstart. This could store some info in the repo object that could alert pykickstart consumers that the repo in question is designed for source and thus we can deal with it correctly when we want the source vs the binary/noarch. This still requires the user creating the config specify that this repo is --source, but I don't think that this is too much to ask.
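To make the proposal concrete, here is a sketch of how a kickstart `repo` line with a `--source` flag could be read. It does not use pykickstart and is not how pykickstart parses commands; it is a small argparse stand-in. `--name`, `--baseurl` and `--mirrorlist` are existing options of the kickstart repo command, while `--source` is the flag proposed in this message and is hypothetical here. The URLs in the usage note are placeholders.

```python
import argparse
import shlex


def parse_repo_line(line):
    """Parse one kickstart-style 'repo' line into a namespace of options."""
    parser = argparse.ArgumentParser(prog="repo", add_help=False)
    parser.add_argument("--name", required=True)
    parser.add_argument("--baseurl")
    parser.add_argument("--mirrorlist")
    # Hypothetical flag: marks the repo as holding source rpms.
    parser.add_argument("--source", action="store_true")

    words = shlex.split(line)
    if not words or words[0] != "repo":
        raise ValueError("not a repo directive: %r" % line)
    return parser.parse_args(words[1:])
```

A consumer such as pungi or livecd-creator could then keep `parse_repo_line("repo --name=fedora-source --baseurl=http://example.com/SRPMS --source")` out of the binary package search and only consult it when gathering SRPMs.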
On Sun, 19 Aug 2007 07:32:28 -0400 Jesse Keating jkeating@redhat.com wrote:
> I'd like to talk to Chris Lumens and Jeremy Katz about the idea of a --source flag to the repo directive in pykickstart. This could store some info in the repo object that could alert pykickstart consumers that the repo in question is designed for source and thus we can deal with it correctly when we want the source vs the binary/noarch. This still requires the user creating the config specify that this repo is --source, but I don't think that this is too much to ask.
I've sent a patch doing exactly this over to anaconda-list (couldn't think of a better place to send pykickstart patches). We'll see what comes of the discussion.
Jesse Keating wrote:
> On Sun, 19 Aug 2007 07:32:28 -0400 Jesse Keating jkeating@redhat.com wrote:
> > I'd like to talk to Chris Lumens and Jeremy Katz about the idea of a --source flag to the repo directive in pykickstart. This could store some info in the repo object that could alert pykickstart consumers that the repo in question is designed for source and thus we can deal with it correctly when we want the source vs the binary/noarch. This still requires the user creating the config specify that this repo is --source, but I don't think that this is too much to ask.
> I've sent a patch doing exactly this over to anaconda-list (couldn't think of a better place to send pykickstart patches). We'll see what comes of the discussion.
Maybe the appropriate list is kickstart-list ;-) I hope this comes through, advancing kickstart is a huge deal in all this (e.g. customizing).
Thanks,
Kind regards,
Jeroen van Meeuwen -kanarip
On Sun, 2007-08-19 at 14:12 +0200, Jeroen van Meeuwen wrote:
> Jesse Keating wrote:
> > I've sent a patch doing exactly this over to anaconda-list (couldn't think of a better place to send pykickstart patches). We'll see what comes of the discussion.
> Maybe the appropriate list is kickstart-list ;-) I hope this comes through, advancing kickstart is a huge deal in all this (e.g. customizing).
kickstart-list tends to be more focused on use of kickstart (the file format) and not as much the devel side. anaconda-devel-list is probably as good as anything for that.
Jeremy
Jesse Keating wrote:
> On Sun, 19 Aug 2007 13:11:46 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> > Any thoughts on this?
> I'd like to talk to Chris Lumens and Jeremy Katz about the idea of a --source flag to the repo directive in pykickstart. This could store some info in the repo object that could alert pykickstart consumers that the repo in question is designed for source and thus we can deal with it correctly when we want the source vs the binary/noarch. This still requires the user creating the config specify that this repo is --source, but I don't think that this is too much to ask.
For composing tools with just kickstart configuration, I guess yum plugins such as protectbase and fastestmirror are out-of-scope now, would it be worth something to keep these in mind nonetheless? (--protect?) Other directives such as includepkgs or exclude could become worthwhile at some point, too.
Kind regards,
Jeroen van Meeuwen -kanarip
On Sun, 19 Aug 2007 14:06:26 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> For composing tools with just kickstart configuration, I guess yum plugins such as protectbase and fastestmirror are out-of-scope now, would it be worth something to keep these in mind nonetheless? (--protect?) Other directives such as includepkgs or exclude could become worthwhile at some point, too.
Yes, there is value in extending the repo attributes in pykickstart. I've discussed it some with Chris, at least the idea of --exclude and --includepkgs and what they would actually mean in the grand scheme of figuring out the package set. He didn't seem greatly opposed, so long as we keep the semantics clear.
As for plugins, that's a larger question to answer. We may need a plugin to accomplish multilib installs by default in say Fedora 9 as we're discussing taking multilib resolution out of the hands of the compose tools and putting it into the hands of the system actually installing packages so that said system can define what multilib strategy they wish to follow. So if this goes through, there would be some precedent for using yum plugins at install time. How this would be accomplished sanely and extensibly I know not. A conversation for another day.
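On the question of what --exclude and --includepkgs "would actually mean" for the package set, one possible reading is sketched here. It is an assumption modelled loosely on yum's per-repo includepkgs/exclude options, not an agreed pykickstart semantic: if any include patterns are given, only matching packages are visible, and exclude patterns then remove packages from whatever is left. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def filter_packages(names, includepkgs=(), exclude=()):
    """Apply glob-style include and exclude patterns to package names."""
    kept = []
    for name in names:
        # With include patterns present, a package must match at least one.
        if includepkgs and not any(fnmatchcase(name, p) for p in includepkgs):
            continue
        # An exclude match always removes the package.
        if any(fnmatchcase(name, p) for p in exclude):
            continue
        kept.append(name)
    return kept
```

Under this reading, a package that matches both an include and an exclude pattern is dropped, which is one of the semantic points that would need to be pinned down.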
Jesse Keating wrote:
> On Sun, 19 Aug 2007 14:06:26 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> > For composing tools with just kickstart configuration, I guess yum plugins such as protectbase and fastestmirror are out-of-scope now, would it be worth something to keep these in mind nonetheless? (--protect?) Other directives such as includepkgs or exclude could become worthwhile at some point, too.
> Yes, there is value in extending the repo attributes in pykickstart. I've discussed it some with Chris, at least the idea of --exclude and --includepkgs and what they would actually mean in the grand scheme of figuring out the package set. He didn't seem greatly opposed, so long as we keep the semantics clear.
> As for plugins, that's a larger question to answer. We may need a plugin to accomplish multilib installs by default in say Fedora 9 as we're discussing taking multilib resolution out of the hands of the compose tools and putting it into the hands of the system actually installing packages so that said system can define what multilib strategy they wish to follow. So if this goes through, there would be some precedent for using yum plugins at install time. How this would be accomplished sanely and extensibly I know not. A conversation for another day.
F9 FUDCon here we come ;-)
Kind regards,
Jeroen van Meeuwen -kanarip
-- http://www.kanarip.com/ RHCE, LPIC-2, MCP, CCNA C6B0 7FB4 43E6 CDDA D258 F70B 28DE 9FDA 9342 BF08
On Sat, 2007-08-18 at 20:09 -0400, Jesse Keating wrote:
> On Sun, 19 Aug 2007 00:58:21 +0200 Jeroen van Meeuwen kanarip@kanarip.com wrote:
> > Right so what I'm reading is that the code needs to forcibly disable (not enable) source repositories initially then only enable them in enable_source_repositories() or none of this works?
> > That's a ticket ;-) [1]
> Yeah, sounds about right. The trick is going to be figuring out what repos are source repos and which aren't, since they can be named anything (and keeping in mind that currently pykickstart just takes repo as an argument, it doesn't have the concept of a source-repo yet. Maybe an RFE?).
My first question would be "why does it matter?" Why not just have more repos listed and if you're doing something with sources, you deal with the repos and have your arches set to src as opposed to "binary" arches. Sure, it's more metadata, but at the end, you're going to end up churning through it all anyway, so I don't know that it's that large of a cost really
Jeremy
On Mon, 20 Aug 2007 10:51:54 -0400 Jeremy Katz katzj@redhat.com wrote:
> My first question would be "why does it matter?" Why not just have more repos listed and if you're doing something with sources, you deal with the repos and have your arches set to src as opposed to "binary" arches. Sure, it's more metadata, but at the end, you're going to end up churning through it all anyway, so I don't know that it's that large of a cost really
Right now? Because yum throws out anything that doesn't match the compat arch list when getting package listings. So you get your package listings from your enabled repos, and it throws out all the source. There doesn't seem to be a good way to 'reset' the object to allow you to bring back in the source packages.
Now you're going to say "fix it in yum instead" and that's fine, that's a reasonable answer. May not be an easy task either.
We could make it not throw out those packages and make consumers of _getSacks do the filtering on their own.
We could try to get yum objects to be able to 'reset' themselves.
We can do the somewhat status quo of Pungi and just create a new yum object, add all the repos again, and do a _getSacks where the archlist is 'src'.
Not sure what the best strategy is. I suppose "working around" it in pykickstart isn't the best.
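The problem and the current pungi workaround described in this message can be illustrated with a toy model. None of this is yum code: `ToySack` is a hypothetical class that only mimics the one behaviour under discussion, namely that packages outside the arch list are discarded when the listing is built, so a second object with archlist 'src' is needed to see source packages.

```python
from collections import namedtuple

Pkg = namedtuple("Pkg", ["name", "arch"])


class ToySack:
    """Hypothetical stand-in for a package sack; not yum's real class."""

    def __init__(self, repo_packages, archlist):
        allowed = set(archlist)
        # Anything outside the arch list is dropped at load time and
        # cannot be recovered from this object afterwards.
        self.packages = [p for p in repo_packages if p.arch in allowed]


# A repo that (unusually) carries binary and source packages together.
REPO = [
    Pkg("bash", "i386"),
    Pkg("bash", "src"),
    Pkg("yum", "noarch"),
    Pkg("yum", "src"),
]

# First object: binary compat arches only, so every src package is gone.
binary_sack = ToySack(REPO, ["i386", "i686", "noarch"])

# Workaround in the style described for pungi: a second object, same
# repos, arch list set to 'src'.
source_sack = ToySack(REPO, ["src"])
```

The cost of the second object is that the repos and configuration have to be loaded twice, which is the overhead a working 'reset' would avoid.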
On Mon, 2007-08-20 at 11:22 -0400, Jesse Keating wrote:
> On Mon, 20 Aug 2007 10:51:54 -0400 Jeremy Katz katzj@redhat.com wrote:
> > My first question would be "why does it matter?" Why not just have more repos listed and if you're doing something with sources, you deal with the repos and have your arches set to src as opposed to "binary" arches. Sure, it's more metadata, but at the end, you're going to end up churning through it all anyway, so I don't know that it's that large of a cost really
> Right now? Because yum throws out anything that doesn't match the compat arch list when getting package listings. So you get your package listings from your enabled repos, and it throws out all the source. There doesn't seem to be a good way to 'reset' the object to allow you to bring back in the source packages.
If you know you're going to be using sources initially, you could include src in your arch list. And then do filtering later.
> Now you're going to say "fix it in yum instead" and that's fine, that's a reasonable answer. May not be an easy task either.
> We could make it not throw out those packages and make consumers of _getSacks do the filtering on their own.
This seems like it's probably the best approach just from 2 seconds worth of thinking about it.
> We could try to get yum objects to be able to 'reset' themselves.
How is a reset different, though, than just creating a new object?
> We can do the somewhat status quo of Pungi and just create a new yum object, add all the repos again, and do a _getSacks where the archlist is 'src'.
> Not sure what the best strategy is. I suppose "working around" it in pykickstart isn't the best.
Yeah, it just feels like it's enforcing things which really don't make a difference for the end-user
Jeremy
On Mon, 20 Aug 2007 11:26:18 -0400 Jeremy Katz katzj@redhat.com wrote:
> If you know you're going to be using sources initially, you could include src in your arch list. And then do filtering later.
> > Now you're going to say "fix it in yum instead" and that's fine, that's a reasonable answer. May not be an easy task either.
> > We could make it not throw out those packages and make consumers of _getSacks do the filtering on their own.
> This seems like it's probably the best approach just from 2 seconds worth of thinking about it.
Which now that I think about it gets us back up to suggestion #1.
> > We could try to get yum objects to be able to 'reset' themselves.
> How is a reset different, though, than just creating a new object?
Well, you don't have to feed it all the config stuff over again, don't have to build up the logging system, don't have to re-add all the repos and such. It's a time and memory saver.
I'm going to play with just adding 'src' to my arch list and doing my own filtering for a bit.
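The alternative being tried here, loading everything with 'src' in the arch list and filtering afterwards, amounts to a single partition step on the consumer's side. A minimal sketch, with plain (name, arch) tuples standing in for yum package objects and a hypothetical function name:

```python
def split_binary_and_source(packages):
    """Partition (name, arch) pairs into binary and source package names."""
    binary, source = [], []
    for name, arch in packages:
        if arch == "src":
            source.append(name)
        else:
            binary.append(name)
    return binary, source


# What a single load with 'src' added to the arch list might return.
loaded = [("bash", "i386"), ("bash", "src"), ("yum", "noarch"), ("yum", "src")]
binary_names, source_names = split_binary_and_source(loaded)
```

Only one object and one metadata load are needed, at the price of every consumer having to remember to filter out src packages when resolving the binary set.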
Hi Matt,
On Mon, 2007-08-13 at 22:11 -0500, Matt Domsch wrote:
> I want to be sure, for license compliance, that all the binary bits on the final LiveCD have corresponding source code available.
> One of the "features" I'd like to see something in the stack of livecd-tools produce is a CD/DVD/whatever of the SRPMS that match the RPMs that go into the LiveCD.
One approach could be to first use pungi -G to compose yum repos of the minimal set of RPMs and SRPMs needed for the livecd, and then build the livecd from those repos. You could then ship those repos separately from the LiveCD.
Since they both get their package lists from a kickstart file, that should be pretty straightforward.
Cheers, Mark.