#206: Update bodhi to enforce Package Update Acceptance policy
----------------------------+-----------------------------------------------
 Reporter:  wwoods          |       Owner:
     Type:  task            |      Status:  new
 Priority:  major           |   Milestone:  Package Update Acceptance Test Plan
Component:  infrastructure  |     Version:  1.0
 Keywords:                  |
----------------------------+-----------------------------------------------
In order to enforce the Package Update Acceptance Test Plan, Bodhi will need to be modified to reject the push of any package that fails acceptance testing.
The acceptance tests may need to send status/data to bodhi in order for it to make decisions about policy (see e.g. ticket #205). Later bodhi might just get data from resultdb (see that milestone for details).
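As a minimal sketch of the enforcement described above (this is not Bodhi's actual code; `AcceptanceResult`, `can_push`, and the choice of required tests are assumptions made for illustration), a push gate could reject any update that has a failing or missing required result:

```python
from dataclasses import dataclass

# Hypothetical set of mandatory tests; the ticket does not fix this list.
REQUIRED_TESTS = frozenset({"depcheck", "upgradepath"})


@dataclass(frozen=True)
class AcceptanceResult:
    """One acceptance-test outcome reported for an update (illustrative type)."""
    testname: str
    outcome: str  # "PASSED" or "FAILED"


def can_push(results, required=REQUIRED_TESTS):
    """Return (allowed, reasons); allowed only if every required test passed."""
    # Later results for the same test override earlier ones.
    latest = {r.testname: r.outcome for r in results}
    reasons = []
    for name in sorted(required):
        outcome = latest.get(name)
        if outcome is None:
            reasons.append(f"{name}: no result")
        elif outcome != "PASSED":
            reasons.append(f"{name}: {outcome}")
    return (not reasons, reasons)
```

Treating a missing result as blocking is a design choice in this sketch; an advisory mode could instead log the reasons and allow the push.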
#206: Update bodhi to enforce Package Update Acceptance policy
------------------------+--------------------------------------------------
 Reporter:  wwoods      |       Owner:
     Type:  task        |      Status:  new
 Priority:  major       |   Milestone:  Package Update Acceptance Test Plan
Component:  production  |  Resolution:
 Keywords:              |  Blocked By:
 Blocking:              |
------------------------+--------------------------------------------------
Comment (by adamwill):
So let's stir the pot a bit:
Are we at the point where we want to do this now?
It was suggested to me on IRC yesterday by someone (I forget who) that maybe we should be enforcing some or all of the AutoQA tests at this point. depcheck seems to be in pretty solid working order. Is there an agreed set of requirements for tests to stop being advisory and start squelching pushes?
Comment (by tflink):
Two questions in response:
1. Is bodhi ready to start enforcing that?
2. I don't think that we currently have a method for overriding failures. Was that supposed to be part of AutoQA or bodhi?
On a related note, do we have any data on how effective the checks in AutoQA are? How important do we think it'll be to have said data? I have a feeling that it will be important to have once we hit the first situation where someone challenges the worthwhile-ness of the AutoQA results.
Comment (by adamwill):
I think Bodhi won't do anything until we ask Luke to do it. That would presumably be the next step if we decided we wanted that. Ditto on failure override - it's hard to implement an override for a mechanism that doesn't exist!
I think it would be good to set a threshold of reliability for the tests and gather data to check that, yeah. I really wanted just to kick the ass of this topic to see how to get it moving.
Comment (by kparal):
I occasionally see some questions or downright fixes related to AutoQA results, which gladdens me. But these are the obstacles until we can enable enforcing mode:
1. Depcheck needs to be re-written. Sorry to say it this roughly. Depcheck doesn't work for packages that use conflicts or obsoletes. Sometimes it doesn't produce any result at all for selected packages. Fortunately these issues concern just a fraction of packages (a few percent at most, I'd guess), but they are present. We are eagerly waiting for libsolv to appear.

   OTOH our second test, upgradepath, seems not to suffer from any issues and probably could be enforcing.
2. There is no possibility to override results. That would probably be implemented inside Bodhi, so it's not our problem. But we have to come up with processes - who decides whether you can override, whether you need some whitelist of packages that are often evaluated incorrectly, etc.
3. We need to implement the option to re-run any test from scratch. Because something had failed or there had been a bug in our test, we could have marked some package incorrectly. We (or the maintainer) should be able to re-run it again. But to tell the truth, we probably don't need this option for the current implementations of depcheck and upgradepath, because they try to evaluate all available packages every time. But for some future tests this will be needed for sure.
4. I think deeper integration of Bodhi and our ResultsDB is required. As we were discussing a long time ago, in order to produce trustworthy results we need to operate on package sets, not individual packages. We can say "if you push this set of packages to stable, it won't break". If one of those packages is unpushed, the whole set must be invalidated and new results must be computed. We can't really achieve that just with Bodhi comments. It needs development on our side and on Bodhi side. Releng team must then push just the package set as indicated by Bodhi, nothing more, nothing less.
5. If we start to experiment with enforcing mode, we'll probably discover many more problems on the way, people will start to get louder.
As I currently see it, I would rather keep the permissive/informative mode on, and a) develop the framework the way we need it b) encourage and promote the informative results (making it more maintainer-friendly, improve documentation, improve integration with Bodhi, etc). Our solution would be currently half-baked at best. Our dependency checking is not good enough, and our upgradepath checking is good enough, but misses the infrastructure to make updates pushing reliable.
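The package-set idea in point 4 can be sketched as follows. This is a hypothetical illustration, not ResultsDB's or Bodhi's interface: `PackageSetResults` and its methods are invented names. A verdict is keyed by the exact set of builds that was tested, so removing or adding a build yields a set with no verdict, which must be re-tested:

```python
import hashlib


class PackageSetResults:
    """Stores test verdicts keyed by the exact set of builds that was tested."""

    def __init__(self):
        self._verdicts = {}

    @staticmethod
    def set_id(builds):
        """Return an order-independent identifier for a set of build names."""
        joined = "\n".join(sorted(set(builds)))
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()

    def record(self, builds, passed):
        """Store the verdict for exactly this set of builds."""
        self._verdicts[self.set_id(builds)] = passed

    def verdict(self, builds):
        """Return True/False for a tested set, or None if this exact set is untested."""
        return self._verdicts.get(self.set_id(builds))
```

For example, after `record(["foo-1.0-1.fc15", "bar-2.0-1.fc15"], True)` (made-up build names), asking for the verdict of `["foo-1.0-1.fc15"]` alone returns `None`, signalling that the reduced set has to be evaluated again before it can be pushed.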
Comment (by tflink):
Replying to [comment:4 kparal]:
> I occasionally see some questions or downright fixes related to AutoQA results, which gladdens me. But these are the obstacles until we can enable enforcing mode:
>
> - Depcheck needs to be re-written.
Another option would be to have a whitelist of failures that are ignored. I'm not sure that's the best idea, but it would solve the biggest problem that you brought up.
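A rough sketch of what such a whitelist could look like is below. The pair formats and the function name `blocking_failures` are assumptions for illustration; only `fnmatch.fnmatchcase` is a real standard-library call:

```python
from fnmatch import fnmatchcase


def blocking_failures(failures, whitelist):
    """Return the failures that still block a push after applying the whitelist.

    `failures` is a list of (package, testname) pairs. `whitelist` is a list of
    (package_pattern, testname) pairs, where the pattern uses shell-style
    wildcards. Both formats are assumptions for illustration.
    """
    def is_ignored(package, testname):
        return any(
            testname == wl_test and fnmatchcase(package, pattern)
            for pattern, wl_test in whitelist
        )

    return [(pkg, test) for pkg, test in failures if not is_ignored(pkg, test)]
```

Scoping each entry to a single test keeps a whitelisted depcheck failure from also hiding an upgradepath failure for the same package.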
Another thought would be to re-think depcheck. I haven't had the chance to try it out yet due to slight incompatibility with Fedora but EDOS (currently used by Debian) is an interesting and very different approach to what we're trying to do with depcheck. [http://arxiv.org/abs/0811.3620 A paper describing the process is available here]
> - There is no possibility to override results. That would probably be implemented inside Bodhi, so it's not our problem. But we have to come up with processes - who decides whether you can override, whether you need some whitelist of packages that are often evaluated incorrectly, etc.
Yeah, we would need this at the very least.
> - We need to implement the option to re-run any test from scratch.
I wonder if we can get away with contributors being able to request re-runs instead of needing an interface for them to do it themselves. Eventually, I agree that we need some self-service interface but I'm not as sure it's blocking the enforcement right now.
> - I think deeper integration of Bodhi and our ResultsDB is required. As we were discussing a long time ago, in order to produce trustworthy results we need to operate on package sets, not individual packages. We can say "if you push this set of packages to stable, it won't break". If one of those packages is unpushed, the whole set must be invalidated and new results must be computed. We can't really achieve that just with Bodhi comments. It needs development on our side and on Bodhi side. Releng team must then push just the package set as indicated by Bodhi, nothing more, nothing less.
I'm not quite sure I follow you on this one. Yeah, it would be great to have better integration between ResultsDB and Bodhi but I'm not sure it's a requirement to start enforcing the update plan. We probably need better integration between AutoQA as a whole and Bodhi but I'm not sure how ResultsDB fits in to making sure that an update is re-tested after modification.
> - If we start to experiment with enforcing mode, we'll probably discover many more problems on the way, people will start to get louder.
Well, I think people are going to complain no matter what we do. However, I also think that taking reasonable steps to counter the loudness would be wise on our part.
> As I currently see it, I would rather keep the permissive/informative mode on, and a) develop the framework the way we need it b) encourage and promote the informative results (making it more maintainer-friendly, improve documentation, improve integration with Bodhi, etc).
Yeah, maintainer-friendliness is something that occurred to me over the weekend. I'm also not sure that we're friendly enough to maintainers in order to start forcing them to use AutoQA results. We could suggest doing so but I imagine that there would be a lot of pushback.
> Our solution would be currently half-baked at best. Our dependency checking is not good enough, and our upgradepath checking is good enough, but misses the infrastructure to make updates pushing reliable.
Well, when do we think that AutoQA would be good enough to start enforcing? I realize that there isn't a good answer to that question - I'm more wondering if we should try to come up with a better answer instead of just "real soon now".
Comment (by jdulaney):
I would say try to make the changes needed to AutoQA be in place for 1.0. By that point, we should be pushing something that is complete enough to enforce at least some tests (even if the enforcement doesn't actually happen, yet). Otherwise, we may want to reconsider the whole project.
Comment (by kparal):
Replying to [comment:5 tflink]:
> > - Depcheck needs to be re-written.
>
> Another option would be to have a whitelist of failures that are ignored. I'm not sure that's the best idea, but it would solve the biggest problem that you brought up.
>
> Another thought would be to re-think depcheck. I haven't had the chance to try it out yet due to slight incompatibility with Fedora but EDOS (currently used by Debian) is an interesting and very different approach to what we're trying to do with depcheck. [http://arxiv.org/abs/0811.3620 A paper describing the process is available here]
Any time I read "Installability is an NP-complete problem" I wonder whether I'm the best person to deal with it. I'd much rather convince more knowledgeable people (yum/rpm people) to write this test for us.
> > - I think deeper integration of Bodhi and our ResultsDB is required. As we were discussing a long time ago, in order to produce trustworthy results we need to operate on package sets, not individual packages. We can say "if you push this set of packages to stable, it won't break". If one of those packages is unpushed, the whole set must be invalidated and new results must be computed. We can't really achieve that just with Bodhi comments. It needs development on our side and on Bodhi side. Releng team must then push just the package set as indicated by Bodhi, nothing more, nothing less.
>
> I'm not quite sure I follow you on this one. Yeah, it would be great to have better integration between ResultsDB and Bodhi but I'm not sure it's a requirement to start enforcing the update plan. We probably need better integration between AutoQA as a whole and Bodhi but I'm not sure how ResultsDB fits in to making sure that an update is re-tested after modification.
I might have over-complicated that one, I'm not sure now. I supposed that sometimes it could be necessary for Bodhi to ask ResultsDB about the very current results, instead of relying on AutoQA-provided comments (that might go obsolete after some package is unpushed/edited). It might be the case, it might not. I'd have to think more about it.
> > Our solution would be currently half-baked at best. Our dependency checking is not good enough, and our upgradepath checking is good enough, but misses the infrastructure to make updates pushing reliable.
>
> Well, when do we think that AutoQA would be good enough to start enforcing? I realize that there isn't a good answer to that question - I'm more wondering if we should try to come up with a better answer instead of just "real soon now".
I believe that depends on the priority we give it. How important is it for us to go to enforcing mode? Can we gain more by having informational results for now (that might not be as perfect as in enforcing mode) and spend our time elsewhere? Those kinds of questions.
autoqa-devel@lists.fedorahosted.org