Today when using F17 Alpha, I ran qemu and got an error which was something like:
qemu-kvm: undefined symbol usbredirhost_foo
(I don't recall the precise symbol). This was just because that version of qemu was compiled against a later version of libusbredirhost.so (but one with the same soname), and updating libusbredirhost.so fixed the problem.
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Rich.
On Thu, Apr 5, 2012 at 8:03 PM, Richard W.M. Jones rjones@redhat.com wrote:
Today when using F17 Alpha, I ran qemu and got an error which was something like:
qemu-kvm: undefined symbol usbredirhost_foo
(I don't recall the precise symbol). This was just because that version of qemu was compiled against a later version of libusbredirhost.so (but one with the same soname), and updating libusbredirhost.so fixed the problem.
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Well the bug here is that usbredir changed ABI without bumping the soname.
On Thu, 2012-04-05 at 20:21 +0200, drago01 wrote:
On Thu, Apr 5, 2012 at 8:03 PM, Richard W.M. Jones rjones@redhat.com wrote:
Today when using F17 Alpha, I ran qemu and got an error which was something like:
qemu-kvm: undefined symbol usbredirhost_foo
(I don't recall the precise symbol). This was just because that version of qemu was compiled against a later version of libusbredirhost.so (but one with the same soname), and updating libusbredirhost.so fixed the problem.
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Well the bug here is that usbredir changed ABI without bumping the soname.
This isn't a backwards-incompatible change, as I understand it. It looks like they added a feature to the ABI and qemu started relying on it. So it really is just a case of the qemu package not properly identifying its dependency.
A library only needs to bump its soname if it changes or removes a public function. Adding a new one is generally fine.
On Thu, Apr 05, 2012 at 08:21:15PM +0200, drago01 wrote:
On Thu, Apr 5, 2012 at 8:03 PM, Richard W.M. Jones rjones@redhat.com wrote:
Today when using F17 Alpha, I ran qemu and got an error which was something like:
qemu-kvm: undefined symbol usbredirhost_foo
(I don't recall the precise symbol). This was just because that version of qemu was compiled against a later version of libusbredirhost.so (but one with the same soname), and updating libusbredirhost.so fixed the problem.
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Well the bug here is that usbredir changed ABI without bumping the soname.
They don't need to, nor should they, bump the soname when adding a new symbol. If they used symbol versioning, however, then RPM would pick up the deps correctly, because it adds deps on all ELF symbol versions.
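To make that concrete, here is a minimal sketch of a GNU ld version script; the version node names and symbols are hypothetical, not usbredir's actual ones:

  /* libusbredirhost.map (illustrative) */
  USBREDIRHOST_0.4 {
      global:
          usbredirhost_open;
          usbredirhost_close;
      local:
          *;
  };

  USBREDIRHOST_0.5 {
      global:
          usbredirhost_foo;    /* new symbol added in a later release */
  } USBREDIRHOST_0.4;

The library would be linked with something like gcc -shared ... -Wl,--version-script=libusbredirhost.map, and RPM's automatic dependency generator would then record requires of the form libusbredirhost.so.1(USBREDIRHOST_0.5)(64bit) on anything using the new symbol, so the mismatch would be caught at depsolve time rather than at runtime.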
Daniel
On Thu, Apr 05, 2012 at 08:21:15PM +0200, drago01 wrote:
On Thu, Apr 5, 2012 at 8:03 PM, Richard W.M. Jones rjones@redhat.com wrote:
Today when using F17 Alpha, I ran qemu and got an error which was something like:
qemu-kvm: undefined symbol usbredirhost_foo
(I don't recall the precise symbol). This was just because that version of qemu was compiled against a later version of libusbredirhost.so (but one with the same soname), and updating libusbredirhost.so fixed the problem.
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Well the bug here is that usbredir changed ABI without bumping the soname.
AIUI you don't need to bump the soname when you add a new function, only if you incompatibly change an existing function or struct.
(Larger questions about the meaning of "ABI" omitted from this message ...)
Rich.
2012/4/6 Richard W.M. Jones rjones@redhat.com:
On Thu, Apr 05, 2012 at 08:21:15PM +0200, drago01 wrote:
On Thu, Apr 5, 2012 at 8:03 PM, Richard W.M. Jones rjones@redhat.com wrote:
Today when using F17 Alpha, I ran qemu and got an error which was something like:
qemu-kvm: undefined symbol usbredirhost_foo
(I don't recall the precise symbol). This was just because that version of qemu was compiled against a later version of libusbredirhost.so (but one with the same soname), and updating libusbredirhost.so fixed the problem.
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Well the bug here is that usbredir changed ABI without bumping the soname.
AIUI you don't need to bump the soname when you add a new function, only if you incompatibly change an existing function or struct.
(Larger questions about the meaning of "ABI" omitted from this message ...)
So this is already handled by rpm once the symbol is properly tagged as introduced in a given version of the upstream project; see the output of rpm -q --provides libxml2 for an example, and glibc does the same.
This moves the resolution from "symbols" to "version dependency", which is a reduction. (I guess it wouldn't be worth it for rpm to explicitly depend on each symbol, or the metadata would be huge.)
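For example, the versioned provides look something like this (output abridged; the exact version nodes depend on the libxml2 build, so treat the list as illustrative):

  $ rpm -q --provides libxml2 | grep LIBXML2
  libxml2.so.2(LIBXML2_2.4.30)(64bit)
  libxml2.so.2(LIBXML2_2.6.0)(64bit)
  ...

and a binary that uses a symbol from one of those version nodes automatically gets a matching versioned Requires.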
Nicolas (kwizart)
Fri 2012-04-06 at 10:17 +0100, Richard W.M. Jones wrote:
AIUI you don't need to bump the soname when you add a new function, only if you incompatibly change an existing function or struct.
To clarify (if I understand correctly): the minor version should be incremented in this case, by incrementing current and age in the libtool versioning, but it'd still be libusbredirhost.so.1, so it wouldn't matter with regard to dependencies.
libfoo.so.major.minor.revision corresponds to libtool current:revision:age = major+minor:revision:minor.
http://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.h...
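A worked example with a hypothetical libfoo may make the mapping clearer:

  -version-info 3:0:2  ->  libfoo.so.1.2.0  (major = current - age = 1, SONAME libfoo.so.1)
  add a new interface: current++, age++, revision reset to 0
  -version-info 4:0:3  ->  libfoo.so.1.3.0  (SONAME still libfoo.so.1)

Only an incompatible change, where age is reset to 0, changes current - age and hence the SONAME.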
/Alexander
Richard W.M. Jones wrote:
Today when using F17 Alpha, I ran qemu and got an error which was something like:
qemu-kvm: undefined symbol usbredirhost_foo
(I don't recall the precise symbol). This was just because that version of qemu was compiled against a later version of libusbredirhost.so (but one with the same soname), and updating libusbredirhost.so fixed the problem.
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
rpm does handle this, provided the library in question uses symbol versioning.
In this case, I'm guessing that usbredir did one or more of the following:
1. broke ABI, but didn't bump the soname
2. doesn't support symbol versioning at all
3. does do symbol versioning (in general), but botched it wrt usbredirhost_foo
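A couple of quick checks can distinguish those cases (the path and soname here are just what I'd expect them to be, not verified):

  objdump -p /usr/lib64/libusbredirhost.so.1 | grep SONAME    # case 1: has the soname changed?
  readelf -V /usr/lib64/libusbredirhost.so.1                  # cases 2/3: are there any version definitions at all?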
-- rex
Richard W.M. Jones (rjones@redhat.com) said:
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Unless you redesign the entire way we store, distribute, and process dependency metadata first, doing anything of this sort is just pushing a world of pain onto everyone.
An enterprising soul could calculate how many library:symbol:version entries a typical package would have, and how many of the same a typical library would need to provide. I am not that soul.
This was rejected for the small case of the kernel's modversioned symbols, and that's only one self-contained package.
Bill
On Tue, 2012-04-10 at 22:52 -0400, Bill Nottingham wrote:
Richard W.M. Jones (rjones@redhat.com) said:
The 'qemu' package has a bug, of sorts: the maintainer should have added a specific Requires line:
Requires: usbredir >= <some version>
However, instead of pushing this problem on packagers, maybe RPM should resolve this by encoding the (admittedly long) list of symbols used by a binary?
Unless you redesign the entire way we store, distribute, and process dependency metadata first, doing anything of this sort is just pushing a world of pain onto everyone.
An enterprising soul could calculate how many library:symbol:version entries a typical package would have, and how many of the same a typical library would need to provide. I am not that soul.
Some quick ballpark numbers from F17:
black-lotus:~% symbols_needed_by_package() {
    rpm -ql "$1" | xargs file | grep ELF | cut -f 1 -d : |
        xargs nm -aDu | sort -u | wc -l
}
black-lotus:~% symbols_needed_by_package xulrunner
3084
black-lotus:~% rpm -q --requires xulrunner | wc -l
114
black-lotus:~% symbols_needed_by_package xorg-x11-server-Xorg
851
black-lotus:~% rpm -q --requires xorg-x11-server-Xorg | wc -l
35
So that's a factor of 25ish more data in the Requires list. No, thanks.
- ajax
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
But this need not be the case. Observe that the packages already contain the data (in the libraries and binaries themselves).
It may need to be indexed for faster lookups, but that indexing only applies to the "provides" side, i.e. libraries. Library symbols are highly regular (e.g. containing large common prefixes), so the size of the index won't be as large as the 25x current size.
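For instance, the per-library symbol list that would feed such an index is already trivial to extract; something like this (path and soname illustrative):

  nm -D --defined-only /usr/lib64/libusbredirhost.so.1 | awk '{ print $NF }' | sort

and, as with most C libraries, nearly every name shares a common prefix (usbredirhost_ here), which compresses well.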
Rich.
On Wed, Apr 11, 2012 at 03:49:29PM +0100, Richard W.M. Jones wrote:
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
But this need not be the case. Observe that the packages already contain the data (in the libraries and binaries themselves).
That data is in the RPM payload though. The YUM depsolving code does not have any of the RPM payloads available - it is still trying to figure out which it needs. So at least the YUM repodata will grow in size significantly, even if the RPMs themselves did not.
Daniel
On Wed, Apr 11, 2012 at 03:53:18PM +0100, Daniel P. Berrange wrote:
On Wed, Apr 11, 2012 at 03:49:29PM +0100, Richard W.M. Jones wrote:
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
But this need not be the case. Observe that the packages already contain the data (in the libraries and binaries themselves).
That data is in the RPM payload though. The YUM depsolving code does not have any of the RPM payloads available - it is still trying to figure out which it needs. So at least the YUM repodata will grow in size significantly, even if the RPMs themselves did not.
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
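Roughly the manual equivalent of what I'm describing, using tools that already exist (package names purely illustrative):

  yumdownloader foo                # fetch the candidate package itself
  rpm -qp --requires foo-*.rpm     # read its dependencies straight out of the rpm
  # fetch whatever is still unresolved the same way, then:
  rpm -Uvh --test ./*.rpm          # check the whole set installs cleanly before committing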
Yes, I know, unless I write the code, I can shut up :-)
Rich.
Richard W.M. Jones wrote:
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
I don't think that's practical, at all. For one, it'd mean a lot of wasted downloading if the transaction turns out to be unresolvable. There are probably also other practical issues with that idea.
Kevin Kofler
On Thu, Apr 12, 2012 at 02:40:36AM +0200, Kevin Kofler wrote:
Richard W.M. Jones wrote:
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
I don't think that's practical, at all. For one, it'd mean a lot of wasted downloading if the transaction turns out to be unresolvable. There are probably also other practical issues with that idea.
Transactions shouldn't ever be unresolvable (barring gross errors such as the network being completely unavailable or every mirror being down).
I think it should be possible to make repos that are always self-consistent even when mirrors only partially mirror or delay content. I have in mind a great proof of this, but this email is too small to contain it.
Rich.
Richard W.M. Jones wrote:
I think it should be possible to make repos that are always self-consistent even when mirrors only partially mirror or delay content. I have in mind a great proof of this, but this email is too small to contain it.
Even across repositories? (RPM Fusion…)
Kevin Kofler
On Thu, Apr 12, 2012 at 05:18:02PM +0200, Kevin Kofler wrote:
Richard W.M. Jones wrote:
I think it should be possible to make repos that are always self-consistent even when mirrors only partially mirror or delay content. I have in mind a great proof of this, but this email is too small to contain it.
Even across repositories? (RPM Fusion…)
Hmm, that is trickier ... The particular problem with RPM Fusion is that no coordination is possible, or even legally permitted.
Rich.
On Thu, 12 Apr 2012 21:39:16 +0100 "Richard W.M. Jones" rjones@redhat.com wrote:
On Thu, Apr 12, 2012 at 05:18:02PM +0200, Kevin Kofler wrote:
Richard W.M. Jones wrote:
I think it should be possible to make repos that are always self-consistent even when mirrors only partially mirror or delay content. I have in mind a great proof of this, but this email is too small to contain it.
Even across repositories? (RPM Fusion…)
Hmm, that is trickier ... The particular problem with RPM Fusion is that no coordination is possible, or even legally permitted.
And there are a billion local, private repos like this.
You cannot expect that fedora will EVER be able to calculate against all of them. And therefore yum has to be able to handle depresolution failure.
-sv
On Thu, Apr 12, 2012 at 04:40:14PM -0400, seth vidal wrote:
On Thu, 12 Apr 2012 21:39:16 +0100 "Richard W.M. Jones" rjones@redhat.com wrote:
On Thu, Apr 12, 2012 at 05:18:02PM +0200, Kevin Kofler wrote:
Richard W.M. Jones wrote:
I think it should be possible to make repos that are always self-consistent even when mirrors only partially mirror or delay content. I have in mind a great proof of this, but this email is too small to contain it.
Even across repositories? (RPM Fusion…)
Hmm, that is trickier ... The particular problem with RPM Fusion is that no coordination is possible, or even legally permitted.
And there are a billion local, private repos like this.
You cannot expect that fedora will EVER be able to calculate against all of them. And therefore yum has to be able to handle depresolution failure.
The only bad thing that happens is that yum downloads some RPMs which it then can't install, so in an edge case it's downloading too much. Normally the RPMs it downloads to find the dependencies are ones it will subsequently install, so there is (in the normal, common case) no overhead.
Rich.
On 04/12/2012 01:54 PM, Richard W.M. Jones wrote:
On Thu, Apr 12, 2012 at 04:40:14PM -0400, seth vidal wrote:
On Thu, 12 Apr 2012 21:39:16 +0100 "Richard W.M. Jones" rjones@redhat.com wrote:
On Thu, Apr 12, 2012 at 05:18:02PM +0200, Kevin Kofler wrote:
Richard W.M. Jones wrote:
I think it should be possible to make repos that are always self-consistent even when mirrors only partially mirror or delay content. I have in mind a great proof of this, but this email is too small to contain it.
Even across repositories? (RPM Fusion…)
Hmm, that is trickier ... The particular problem with RPM Fusion is that no coordination is possible, or even legally permitted.
And there are a billion local, private repos like this.
You cannot expect that fedora will EVER be able to calculate against all of them. And therefore yum has to be able to handle depresolution failure.
The only bad thing that happens is that yum downloads some RPMs which it then can't install, so in an edge case it's downloading too much. Normally the RPMs it downloads to find the dependencies are ones it will subsequently install, so there is (in the normal, common case) no overhead.
If you're always running "yum -y", then there's probably little difference. Otherwise, downloading while resolving will add significant delay between starting the command and confirming "y" to actually run it.
Anyone on a slow connection will curse you for this, and even on my fast connection, it takes a bit to download hundreds of MB for debuginfo updates...
Josh
On Thu, Apr 12, 2012 at 09:03:36 GMT, Richard W.M. Jones wrote:
I think it should be possible to make repos that are always self-consistent even when mirrors only partially mirror or delay content. I have in mind a great proof of this, but this email is too small to contain it.
Rob Escriva and I started CHASM, but since we've graduated, not much work has been done on it.
Simplistically, it cuts a line between rsync and bittorrent. It downloads blobs by hash and then hardlinks them into the tree. Mirrors would then get swarming behavior (there would still be tiers; the first-class mirrors would be seeded first, then "manifests"[1] would be released to the public mirrors). Only once a mirror has a complete update would it publish the new packages in the public tree.
We talked at FUDCon Toronto with warthog9, Matt Domsch, and Seth Vidal and they liked it. Unfortunately, not much time has been spent on the project by either of us of late.
- --Ben
[1] These are GPG-signed and verified lists of every directory, symlink, and blob in the tree. Mirrors could blacklist paths they don't want to sync (version, arch, debuginfo, whatever).
On Wed, Apr 11, 2012 at 8:01 PM, Richard W.M. Jones rjones@redhat.com wrote:
On Wed, Apr 11, 2012 at 03:53:18PM +0100, Daniel P. Berrange wrote:
On Wed, Apr 11, 2012 at 03:49:29PM +0100, Richard W.M. Jones wrote:
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
But this need not be the case. Observe that the packages already contain the data (in the libraries and binaries themselves).
That data is in the RPM payload though. The YUM depsolving code does not have any of the RPM payloads available - it is still trying to figure out which it needs. So at least the YUM repodata will grow in size significantly, even if the RPMs themselves did not.
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
Ugh no ... the whole point of the repodata is to avoid having to download the rpms to calculate deps.
On Sat, Apr 14, 2012 at 06:21:15PM +0200, drago01 wrote:
On Wed, Apr 11, 2012 at 8:01 PM, Richard W.M. Jones rjones@redhat.com wrote:
On Wed, Apr 11, 2012 at 03:53:18PM +0100, Daniel P. Berrange wrote:
On Wed, Apr 11, 2012 at 03:49:29PM +0100, Richard W.M. Jones wrote:
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
But this need not be the case. Observe that the packages already contain the data (in the libraries and binaries themselves).
That data is in the RPM payload though. The YUM depsolving code does not have any of the RPM payloads available - it is still trying to figure out which it needs. So at least the YUM repodata will grow in size significantly, even if the RPMs themselves did not.
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
Ugh no ... the whole point of the repodata is to avoid having to download the rpms to calculate deps.
Well the "whole" point is to get the best possible software quality, user experience and performance (accepting that we cannot maximize all of these at the same time). It's my personal opinion that yum does not do well on any of these three criteria.
Rich.
On 14.04.2012 18:39, Richard W.M. Jones wrote:
On Sat, Apr 14, 2012 at 06:21:15PM +0200, drago01 wrote:
On Wed, Apr 11, 2012 at 8:01 PM, Richard W.M. Jones rjones@redhat.com wrote:
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
Ugh no ... the whole point of the repodata is to avoid having to download the rpms to calculate deps.
Well the "whole" point is to get the best possible software quality, user experience and performance (accepting that we cannot maximize all of these at the same time). It's my personal opinion that yum does not do well on any of these three criteria.
And you think performance and user experience will get better by downloading packages for dep-solving?
Are you aware that many people do not have endless bandwidth, traffic allowances and storage, and can you imagine how slow this all would be?
yum should not waste resources, which it did even in the recent past by consuming way too much memory, resulting in it getting killed by the OOM killer on machines with 512 MB RAM.
And yes, 512 MB of RAM really is enough for many servers, and there is no justification for an UPDATER eating more resources than the whole server does in normal operation.
14.04.2012 20:44, Reindl Harald wrote:
On 14.04.2012 18:39, Richard W.M. Jones wrote:
On Sat, Apr 14, 2012 at 06:21:15PM +0200, drago01 wrote:
On Wed, Apr 11, 2012 at 8:01 PM, Richard W.M. Jonesrjones@redhat.com wrote:
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
Ugh no ... the whole point of the repodata is to avoid having to download the rpms to calculate deps.
Well the "whole" point is to get the best possible software quality, user experience and performance (accepting that we cannot maximize all of these at the same time). It's my personal opinion that yum does not do well on any of these three criteria.
And you think performance and user experience will get better by downloading packages for dep-solving?
Are you aware that many people do not have endless bandwidth, traffic allowances and storage, and can you imagine how slow this all would be?
yum should not waste resources, which it did even in the recent past by consuming way too much memory, resulting in it getting killed by the OOM killer on machines with 512 MB RAM.
And yes, 512 MB of RAM really is enough for many servers, and there is no justification for an UPDATER eating more resources than the whole server does in normal operation.
What is the point of always storing the metadata in static XML or SQLite files, instead of providing a server-side service to speed up solving? Of course it would require some script running on the server side to provide such a service, and it would somewhat limit mirroring (there could be a fallback to the old scheme), but it would also have several benefits:
1) On the server side the metadata can be any size and optimized for internal use, since it is not transferred each time.
2) It can be cached.
3) Clients can ask for only small parts of the data, which in most cases is all they want.
Here is how it looks to me at first glance. Install or update of a single package (yum install foo):
1) The client asks for the latest version of the foo package.
2) The server calculates all dependencies with its own algorithms and returns, in the requested form (several formats could be supported, from JSON to XML), the full list of dependencies for that package, with no other overhead such as the dependencies of all packages, file lists, etc.
3) The client receives the list, intersects it with the currently installed packages, excludes whatever is already satisfied, and then requests each missing package, starting again from 1.
Update scenario (yum update):
1) The client asks the repo server for a list of the current versions of the available packages.
2) The server answers.
3) The client determines which packages have updates and requests them as in the first scenario.
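Purely as an illustration of the kind of exchange I have in mind (URL, parameters and field names are all made up):

  $ curl 'https://depsolver.example.org/resolve?pkg=foo&arch=x86_64'
  {
    "package":  "foo-1.0-1.fc17.x86_64",
    "requires": [ "libbar.so.2()(64bit)", "baz >= 0.3" ]
  }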
On Sun, Apr 15, 2012 at 11:16:58PM +0400, Pavel Alexeev wrote:
Here is how it looks to me at first glance. Install or update of a single package (yum install foo):
1) The client asks for the latest version of the foo package.
2) The server calculates all dependencies with its own algorithms and returns, in the requested form (several formats could be supported, from JSON to XML), the full list of dependencies for that package, with no other overhead such as the dependencies of all packages, file lists, etc.
3) The client receives the list, intersects it with the currently installed packages, excludes whatever is already satisfied, and then requests each missing package, starting again from 1.
Update scenario (yum update):
1) The client asks the repo server for a list of the current versions of the available packages.
2) The server answers.
3) The client determines which packages have updates and requests them as in the first scenario.
I don't think this would be a speedup. Instead of the CPUs of tens of thousands of computers doing the depsolving, you'd be requiring the CPUs of a single site to do it. The clients would have to upload the provides of their installed packages, so bandwidth needs might increase. If I were installing a few packages by trial and error/memory, I'd likely do yum install tmux followed closely by yum install zsh, which would require separate requests to the server to download separate dependency information, as opposed to having the information downloaded once.
The server that constructs the subsets of repodata would become a single point of failure, whereas currently the repodata can be hosted on any mirror. This setup would be much more sensitive to mirrors and repodata going out of sync. There'd likely be times, when a new push had gone out, where the primary mirror was the only server which could push packages out, as every other mirror would be out of sync with respect to the repodata server.
-Toshio
16.04.2012 00:51, Toshio Kuratomi wrote:
On Sun, Apr 15, 2012 at 11:16:58PM +0400, Pavel Alexeev wrote:
Here is how it looks to me at first glance. Install or update of a single package (yum install foo):
1) The client asks for the latest version of the foo package.
2) The server calculates all dependencies with its own algorithms and returns, in the requested form (several formats could be supported, from JSON to XML), the full list of dependencies for that package, with no other overhead such as the dependencies of all packages, file lists, etc.
3) The client receives the list, intersects it with the currently installed packages, excludes whatever is already satisfied, and then requests each missing package, starting again from 1.
Update scenario (yum update):
1) The client asks the repo server for a list of the current versions of the available packages.
2) The server answers.
3) The client determines which packages have updates and requests them as in the first scenario.
I don't think this would be a speedup. Instead of the CPUs of tens of thousands of computers doing the depsolving, you'd be requiring the CPUs of a single site to do it.
Yes. And since many clients do the same work, caching would give good results there, so subsequent requests would cost almost nothing.
The clients would have to upload the provides of their installed packages, so bandwidth needs might increase. If I were installing a few packages by trial and error/memory, I'd likely do yum install tmux followed closely by yum install zsh, which would require separate requests to the server to download separate dependency information, as opposed to having the information downloaded once.
If you request yum install tmux zsh, of course it should be sent and calculated on the server in one request. Also, caching answers on the client side is not forbidden.
The server that constructs the subsets of repodata would become a single point of failure, whereas currently the repodata can be hosted on any mirror. This setup would be much more sensitive to mirrors and repodata going out of sync. There'd likely be times, when a new push had gone out, where the primary mirror was the only server which could push packages out, as every other mirror would be out of sync with respect to the repodata server.
Yes, as I wrote initially, it introduces more requirements for the server, especially that some sort of scripting be allowed (PHP, Perl, Python, Ruby, or other). But it does not exclude mirroring at all: it is free software, anyone may install it and sync the metadata in the traditional way.
-Toshio
On Mon, Apr 16, 2012 at 02:02:31AM +0400, Pavel Alexeev wrote:
16.04.2012 00:51, Toshio Kuratomi wrote:
On Sun, Apr 15, 2012 at 11:16:58PM +0400, Pavel Alexeev wrote:
Here is how it looks to me at first glance. Install or update of a single package (yum install foo):
1) The client asks for the latest version of the foo package.
2) The server calculates all dependencies with its own algorithms and returns, in the requested form (several formats could be supported, from JSON to XML), the full list of dependencies for that package, with no other overhead such as the dependencies of all packages, file lists, etc.
3) The client receives the list, intersects it with the currently installed packages, excludes whatever is already satisfied, and then requests each missing package, starting again from 1.
Update scenario (yum update):
1) The client asks the repo server for a list of the current versions of the available packages.
2) The server answers.
3) The client determines which packages have updates and requests them as in the first scenario.
I don't think this would be a speedup. Instead of the CPUs of tens of thousands of computers doing the depsolving, you'd be requiring the CPUs of a single site to do it.
Yes. And since many clients do the same work, caching would give good results there, so subsequent requests would cost almost nothing.
No. Most requests will be different because they have a different initial state.
The clients would have to upload the provides of their installed packages, so bandwidth needs might increase. If I were installing a few packages by trial and error/memory, I'd likely do yum install tmux followed closely by yum install zsh, which would require separate requests to the server to download separate dependency information, as opposed to having the information downloaded once.
If you request yum install tmux zsh, of course it should be sent and calculated on the server in one request. Also, caching answers on the client side is not forbidden.
But I was saying that I often do yum install zsh followed by yum install tmux as I notice things that I've forgotten to install on a machine I'm starting to work on. Similarly, yum install libfoo-devel followed by yum install libbar-devel as I'm pulling the dependencies for a new piece of software I'm building or developing. Caching on the client won't help if what we're caching is only a partial piece of the repodata.
The server that constructs the subsets of repodata would become a single point of failure, whereas currently the repodata can be hosted on any mirror. This setup would be much more sensitive to mirrors and repodata going out of sync. There'd likely be times, when a new push had gone out, where the primary mirror was the only server which could push packages out, as every other mirror would be out of sync with respect to the repodata server.
Yes, as I wrote initially, it introduces more requirements for the server, especially that some sort of scripting be allowed (PHP, Perl, Python, Ruby, or other). But it does not exclude mirroring at all: it is free software, anyone may install it and sync the metadata in the traditional way.
If you're requiring that mirrors run the script on their systems, then that makes this idea pretty much a non-starter. We've been told by mirrors that they do not want to do this.
-Toshio
16.04.2012 09:33, Toshio Kuratomi wrote:
On Mon, Apr 16, 2012 at 02:02:31AM +0400, Pavel Alexeev wrote:
16.04.2012 00:51, Toshio Kuratomi wrote:
On Sun, Apr 15, 2012 at 11:16:58PM +0400, Pavel Alexeev wrote:
Here is how it looks to me at first glance. Install or update of a single package (yum install foo):
1) The client asks for the latest version of the foo package.
2) The server calculates all dependencies with its own algorithms and returns, in the requested form (several formats could be supported, from JSON to XML), the full list of dependencies for that package, with no other overhead such as the dependencies of all packages, file lists, etc.
3) The client receives the list, intersects it with the currently installed packages, excludes whatever is already satisfied, and then requests each missing package, starting again from 1.
Update scenario (yum update):
1) The client asks the repo server for a list of the current versions of the available packages.
2) The server answers.
3) The client determines which packages have updates and requests them as in the first scenario.
I don't think this would be a speedup. Instead of the CPUs of tens of thousands of computers doing the depsolving, you'd be requiring the CPUs of a single site to do it.
Yes. And since many clients do the same work, caching would give good results there, so subsequent requests would cost almost nothing.
No. Most requests will be different because they have a different initial state.
If you read my suggestion, I am not proposing that the client upload a big overhead of its current installed state. The client asks the server for the full dependency list a package has, and then intersects the answer with the currently installed software, so the answer for a given package will be the same for each client. Additionally, it still allows resolving dependencies across several enabled repositories when some dependencies cannot be resolved from one server's repo.
The server that constructs the subsets of repodata would become a single point of failure, whereas currently the repodata can be hosted on any mirror. This setup would be much more sensitive to mirrors and repodata going out of sync. There'd likely be times, when a new push had gone out, where the primary mirror was the only server which could push packages out, as every other mirror would be out of sync with respect to the repodata server.
Yes, as I wrote initially, it introduces more requirements for the server, especially that some sort of scripting be allowed (PHP, Perl, Python, Ruby, or other). But it does not exclude mirroring at all: it is free software, anyone may install it and sync the metadata in the traditional way.
If you're requiring that mirrors run the script on their systems, then that makes this idea pretty much a non-starter. We've been told by mirrors that they do not want to do this.
For such mirrors the traditional fallback scheme would still be available. But if this were implemented, I think new mirrors would appear too, and clients could prefer one type or the other depending on their needs. Additionally, it could be implemented as a solver-only mirror that serves resolution requests and then points to other mirror(s) for the downloads; in that case the requirements would be small and it could be hosted even on shared hosting. So I, too, could provide such a mirror.
-Toshio
On Sat, Apr 14, 2012 at 6:39 PM, Richard W.M. Jones rjones@redhat.com wrote:
On Sat, Apr 14, 2012 at 06:21:15PM +0200, drago01 wrote:
On Wed, Apr 11, 2012 at 8:01 PM, Richard W.M. Jones rjones@redhat.com wrote:
On Wed, Apr 11, 2012 at 03:53:18PM +0100, Daniel P. Berrange wrote:
On Wed, Apr 11, 2012 at 03:49:29PM +0100, Richard W.M. Jones wrote:
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
But this need not be the case. Observe that the packages already contain the data (in the libraries and binaries themselves).
That data is in the RPM payload though. The YUM depsolving code does not have any of the RPM payloads available - it is still trying to figure out which it needs. So at least the YUM repodata will grow in size significantly, even if the RPMs themselves did not.
I'm not arguing that's how yum works now, but it doesn't have to work that way!
It could incrementally download the RPMs during depsolving, test that they work together, and with that information download further packages as necessary ...
Ugh no ... the whole point of the repodata is to avoid having to download the rpms to calculate deps.
Well the "whole" point is to get the best possible software quality, user experience and performance (accepting that we cannot maximize all of these at the same time). It's my personal opinion that yum does not do well on any of these three criteria.
OK, but your suggestion does not really make the overall experience any better (it does the opposite).
On Wednesday, 11 April 2012 17:49:29 Richard W.M. Jones wrote:
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
I may be wrong, but I think 25x the number of nodes in the dependency graph would kill us faster than the size of the RPM/yum metadata would. Can our SAT solvers handle this increase at all?
On Thu, Apr 12, 2012 at 10:08:24AM +0300, Oron Peled wrote:
On Wednesday, 11 April 2012 17:49:29 Richard W.M. Jones wrote:
On Wed, Apr 11, 2012 at 10:11:40AM -0400, Adam Jackson wrote:
So that's a factor of 25ish more data in the Requires list. No, thanks.
I'm assuming your argument is that you don't want to ship RPMs or repositories where part of them grows to be 25x larger.
I may be wrong, but I think 25x the number of nodes in the dependency graph would kill us faster than the size of the RPM/yum metadata would. Can our SAT solvers handle this increase at all?
The solving itself isn't really affected by this, because the SAT rules only contain package<->package relationships, thus no dependencies. The rule creation will be the part that's slower. And you'll need much more memory...
Cheers, Michael.