Hi, folks! It's been a recurring issue in the blocker review / release validation process in recent times that we run across bugs that qualify as blockers, but for which the fix does not need to be in the final frozen media or install trees.
Common cases are bugs related to upgrading, especially since the introduction of fedup and even more so of dnf-system-upgrade; most upgrade-related issues can now be sufficiently fixed by package updates to either the source or target release. Bugs to do with writing USB media from the previous release, for instance, also often fall in this category.
Up until now we've been sort of handwaving these as 'special blockers', following the regular blocker process but noting in comments that they don't block the compose. We haven't been tracking very hard if they actually *are* being fixed with updates promptly, we've just been sort of waving a magic wand and assuming it will happen. I just found one which was supposed to be fixed with a 0-day update for Beta, but hadn't been fixed yet: https://bugzilla.redhat.com/show_bug.cgi?id=1263230
So, we kinda need to do this better.
Up top I'd like to note there are really kind of two buckets of 'special blockers' for any given release. If the release being validated is N, they are:
1) Bugs for which the fix must be in the 0-day update set for N
2) Bugs for which the fix must be stable in N-1 and N-2 by N release day
There will be a lot of process nerd detail involved in any fix, but before any detailed drafts I'd like to suggest two broad possible approaches and see what people think:
#1 Separate trackers
--------------------
As a sort of on-the-spot PoC for F23 Beta, I created a new tracker bug for bucket 1: https://bugzilla.redhat.com/show_bug.cgi?id=1264167
We could formalize that approach, and have a '0-day' blocker tracker for each milestone. We could either just have one '0-day' tracker and throw bugs for both the pending release and previous stable releases on the same tracker and keep track of what needs updating where in each bug, or we could have two 0-day trackers for each milestone, one for the pending release, and one for previous stable releases.
So we'd have something like:
F24AlphaBlocker
F24AlphaFreezeException
F24Alpha0Day
(F24AlphaStable) (? - better alias suggestions welcome)
and so on. This is a lot of bugs, but there is a script to create them, so we're not adding a bunch of onerous work.
So far as proposing bugs goes, I think we'd probably want to extend the current somewhat flexible approach; formally you would nominate a bug as a particular type of blocker/FE (by marking it as blocking the appropriate tracker), but we would move things around in blocker review meetings sensibly, as we currently do (when something is nominated as FE but should really be a blocker, or vice versa). In blocker review we'd go through all bugs nominated for any of the trackers, probably starting with 'Blocker', then '0Day' and 'Stable', then 'FreezeException'.
#2 MOAR METADATA
----------------
The alternative is to make the existing Blocker trackers do more work. In this model we wouldn't add any new tracker bugs; we'd just add new 'magic words' in the Whiteboard field. Right now, an accepted blocker is identified by the string 'AcceptedBlocker' appearing in the whiteboard field. We could simply add some more magical strings like that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions welcome).
I kind of like this idea as it's less change and involves creating fewer new bugs. We'd have to make some changes to blockerbugs either way - tflink can say if either approach would be more work in blockerbugs, but I'm gonna guess they'd be fairly similar.
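To make option #2 a bit more concrete, here is a minimal Python sketch of how a tool could sort bugs into categories from the Whiteboard field. 'AcceptedBlocker' is the existing string; 'Accepted0Day' and 'AcceptedStable' are only the names floated above, and treating the whiteboard as whitespace-separated tokens is an assumption of this sketch, not a description of how blockerbugs actually parses it.

```python
from enum import Enum


class Category(Enum):
    BLOCKER = "AcceptedBlocker"   # existing magic word
    ZERO_DAY = "Accepted0Day"     # proposed name, not final
    STABLE = "AcceptedStable"     # proposed name, not final


def classify(whiteboard):
    """Return the set of categories whose magic word appears in the whiteboard."""
    # Assumption: magic words are separated by whitespace.
    tokens = whiteboard.split()
    return {category for category in Category if category.value in tokens}


print(classify("AcceptedBlocker"))
print(classify("RejectedFreezeException Accepted0Day"))
```

A bug could carry more than one of these strings at once, which is why the function returns a set rather than a single value.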
With either approach, the basic goal is to make it more feasible to keep an eye on each of the different categories of 'release blocker' bugs; right now we have solid processes in place for ensuring the 'normal' blockers are all addressed in the release media, but we don't have any processes in place for ensuring 0Day and Stable bugs actually get updates shipped when we say they must.
My suggestion would be that we make sure 'blockerbugs' includes lists of each type of blocker. Ahead of and at Go/No-Go meetings, we would want to have a formal assurance from the person responsible for fixing the bug that the fix would be provided by a certain time - say, one day or two days ahead of the release date - and it would be QA's responsibility to ensure the updates are tested promptly, and releng's responsibility to ensure they are pushed on time after being tested. I would suggest the Program Manager ought to have overall responsibility for keeping an eye on the 0Day and Stable blocker lists and making sure the maintainer, QA, and releng all did their jobs on time.
It'd be great if folks could post their general thoughts on this, and any preference for option 1 or option 2. Thanks!
I suggest we have only one ZeroDay, i.e. for Final, and do away with the intermediate ones. As I see it, ZeroDay comes at a cost, and we would also need basic sanity test cases automated to ensure ZeroDay fixes won't introduce or regress a blocker.
How about automatically qualifying any freeze exception in the current phase as a blocker for the next phase, and keeping 0day only for RC?
AlphaBlocker --> AlphaFreezeException --> BetaBlocker --> BetaFreezeException --> FinalBlocker --> FinalBlockerException --> ZeroDay
This would mean we are less liberal about letting blockers linger in a phase, but I think that is an okay tradeoff.
From a tracking perspective, I think we may just want a phaseBlocker tracker for each milestone, plus FinalBlocker and 0Day trackers for Final, along with backPortfix trackers: one for the pending release and one for previous stable releases.
Regards, Sudhir
On Thu, 2015-11-19 at 19:09 +0530, Sudhir D wrote:
I suggest we have only one ZeroDay, i.e. for Final, and do away with the intermediate ones. As I see it, ZeroDay comes at a cost, and we would also need basic sanity test cases automated to ensure ZeroDay fixes won't introduce or regress a blocker.
How about automatically qualifying any freeze exception in the current phase as a blocker for the next phase, and keeping 0day only for RC?
AlphaBlocker --> AlphaFreezeException --> BetaBlocker --> BetaFreezeException --> FinalBlocker --> FinalBlockerException --> ZeroDay
This would mean we are less liberal about letting blockers linger in a phase, but I think that is an okay tradeoff.
From a tracking perspective, I think we may just want a phaseBlocker tracker for each milestone, plus FinalBlocker and 0Day trackers for Final, along with backPortfix trackers: one for the pending release and one for previous stable releases.
Well, the thing is, the criteria are organized by milestone, and we hit this situation quite often at Beta: the upgrade criteria kick in at Beta, for instance. So if upgrade from F23 to F24 Beta is completely broken, but the fix has to go out as an F23 update, we should really be tracking that to make sure it does. If we only make sure the fix goes out by Final, are we really honouring the criteria properly?
I don't think it's appropriate to turn FEs into blockers automatically, in fact there are obvious cases where it certainly wouldn't be appropriate: bugs in non-blocking desktops are typically taken as FEs, for instance, as are bugs in secondary arches. Neither of those can ever be blockers by policy.
On 11/19/2015 11:40 PM, Adam Williamson wrote:
Well, the thing is, the criteria are organized by milestone, and we hit this situation quite often at Beta: the upgrade criteria kick in at Beta, for instance. So if upgrade from F23 to F24 Beta is completely broken, but the fix has to go out as an F23 update, we should really be tracking that to make sure it does. If we only make sure the fix goes out by Final, are we really honouring the criteria properly?
If a blocker bug breaks the phase criteria, then there is no phase exit until the bug is fixed. The exception is when we are ready to take the risk, which might happen in certain cases earlier in the cycle, but such instances should be zero once we are in Beta. That way, we would still be honoring the phase exit criteria. By definition, BlockerExceptions should not contain any phase exit criteria bugs; they can be related to an important feature which is partially broken. For the Final phase, though, all identified blockers and BlockerExceptions carried over from earlier phases are fixed before GA, and if any in that list are excepted in this phase (after risk assessment), we can consider them for 0day.
I don't think it's appropriate to turn FEs into blockers automatically, in fact there are obvious cases where it certainly wouldn't be appropriate: bugs in non-blocking desktops are typically taken as FEs, for instance, as are bugs in secondary arches. Neither of those can ever be blockers by policy.
Ok. We should probably stop calling them FEs in that case :) and have a mechanism to track them on the basis of priority and have them fixed before RC.
Regards, Sudhir
On Fri, 2015-11-20 at 16:18 +0530, Sudhir D wrote:
I don't think it's appropriate to turn FEs into blockers automatically, in fact there are obvious cases where it certainly wouldn't be appropriate: bugs in non-blocking desktops are typically taken as FEs, for instance, as are bugs in secondary arches. Neither of those can ever be blockers by policy.
Ok. We should probably stop calling them FEs in that case :) and have a mechanism to track them on the basis of priority and have them fixed before RC.
So just to be clear on the terms here: in Fedora we have 'blocker' and 'freeze exception' bugs. A 'blocker' bug must be fixed for the release to go ahead. A 'freeze exception' bug doesn't *have* to be fixed, but we will accept a fix during the milestone freeze period.
Sudhir explained his broader meaning to me on the phone, and it's a good point: so long as we don't actually have any process/mechanism for ensuring that 'special' blockers are fixed on time, calling them blockers is really misleading, because we aren't really having them 'block' the release. You're certainly right about that. At Monday's meeting, though, we agreed that we're generally in favour of changing the release process such that we *do* effectively block the release for these bugs, so if we do go ahead and implement that, it will still be correct to refer to them as 'blockers'.
On 11/18/2015 07:36 PM, Adam Williamson wrote:
With either approach, the basic goal is to make it more feasible to keep an eye on each of the different categories of 'release blocker' bugs; right now we have solid processes in place for ensuring the 'normal' blockers are all addressed in the release media, but we don't have any processes in place for ensuring 0Day and Stable bugs actually get updates shipped when we say they must.
My suggestion would be that we make sure 'blockerbugs' includes lists of each type of blocker. Ahead of and at Go/No-Go meetings, we would want to have a formal assurance from the person responsible for fixing the bug that the fix would be provided by a certain time - say, one day or two days ahead of the release date - and it would be QA's responsibility to ensure the updates are tested promptly, and releng's responsibility to ensure they are pushed on time after being tested. I would suggest the Program Manager ought to have overall responsibility for keeping an eye on the 0Day and Stable blocker lists and making sure the maintainer, QA, and releng all did their jobs on time.
The biggest issue is this, I think. We probably need to encode "Special Blockers" into the Go/No-Go process. I don't think that assurance that it will be fixed on time is necessarily good enough. Particularly given the time that it takes stable updates to make it to the mirrors, I'd say that we probably want to say that any such special blockers have to be queued for stable before the Go/No-Go decision is made. (This may in some cases mean *during* the Go/No-Go meeting, of course.)
Stephen Gallagher wrote on Thu, 19 Nov 2015 at 15:50 -0500:
The biggest issue is this, I think. We probably need to encode "Special Blockers" into the Go/No-Go process. I don't think that assurance that it will be fixed on time is necessarily good enough. Particularly given the time that it takes stable updates to make it to the mirrors, I'd say that we probably want to say that any such special blockers have to be queued for stable before the Go/No-Go decision is made. (This may in some cases mean *during* the Go/No-Go meeting, of course.)
+1. I think that even 0day bugs should block the release. If the update for a 0day bug only becomes available after Go/No-Go and we then find that it is buggy (it doesn't fix the bug properly, or it creates other problems), we would end up with an unresolved blocker at release time. Maybe we could add a rule for delaying the release in such cases (something like a big red button: stop the release now!).
Of the two proposed solutions I like the second more: adding more magic words to the Whiteboard. As I said, I think that 0day blockers should still block the release. But I agree that we should treat them differently, so we should track them somehow.
Well, here's our latest mess-up: https://bodhi.fedoraproject.org/updates/FEDORA-2015-e00b75e39f
dnf-plugin-system-upgrade-0.7.0-1.fc22 had enough karma for stable on Oct 29, which was Go/No-Go day. Therefore it was considered "resolved". However, it was pushed to testing on Nov 2 (4 days later) and to stable on Nov 5 (7 days later!), which was the public release day. Since mirrormanager is configured to serve even last-but-one metadata (i.e. metadata that is 1-2 days old; releng can provide a more precise value), many of our users upgraded on Nov 5 and Nov 6 using an older version of system-upgrade, which broke their systems. Just read the comments: https://fedoramagazine.org/upgrading-from-fedora-22-to-fedora-23/#comments
I was very unhappy. We had solved most of the issues, it was a lot of work, and yet a large group of people was hit by those old, long-resolved problems, just because of bad timing and slow repo pushes (for whatever reason).
So, that update was "queued for stable before the Go/No-Go" as you proposed, and yet we have failed to deliver it. So if we really want to avoid such problems in the future, we either need to insist on "pushed to stable by Go/No-Go, no exceptions", or we need to have another check on release day and verify that all required builds were pushed to stable at least 2 days before -- if not, do not announce the release and wait for more days. The first approach is slightly impractical (we don't want to wait another week, it might be resolved in 2 more days; do we lift final freeze or not?), the second approach is confusing for media (media announce we're Go, and then nothing happens on the proclaimed release day).
What I see as a potential solution here is decoupling the tasks that need to wait for the 0day blockers from those which don't. So, at the Go/No-Go meeting, we can decide that it is No-Go in general, but that the composes are final now and can be uploaded to the proper locations for mirrors to pick them up. I don't know exactly what else releng needs to do, but I guess there will be other tasks that can be done. And in 2-3 days, we can have Go/No-Go again, where we decide that the 0day blockers have also been addressed and pushed to stable, and we can pronounce the whole release Go and publish the announcement immediately, or the next day, or whatever's appropriate (bearing in mind that there should be a 2-day period after the 0day blockers are pushed stable).
WDYT? Reasonable? Complicated? Bonkers? Off the mark?
On 11/20/2015 07:16 AM, Kamil Paral wrote:
Well, here's our latest mess-up: https://bodhi.fedoraproject.org/updates/FEDORA-2015-e00b75e39f dnf-plugin-system-upgrade-0.7.0-1.fc22 had enough karma for stable on Oct 29, which was Go/No-Go day. Therefore it was considered "resolved".
"Had enough karma" != queued for stable. When I say "queued for stable", I mean that it needs to be "submitted for stable" and awaiting a push (if not already pushed). According to the history on that bug, it was not actually submitted for stable until November 2nd. That would have failed my criterion above, since that was after Go/No-Go.
On 11/20/2015 03:56 PM, Stephen Gallagher wrote:
"Had enough karma" != queued for stable. When I say "queued for stable", I mean that it needs to be "submitted for stable" and awaiting a push (if not already pushed). According to the history on that bug, it was not actually submitted for stable until November 2nd. That would have failed my criterion above, since that was after Go/No-Go.
Yup, I think "queued for stable" is the right thing to require here. Releng always does a 0 day push; we just need to ensure during the blocker review process that things that need to be included in that push are actually queued for stable.
That should be enough for all practical purposes. I mean, releng's 0 day push may fail of course or take longer than expected, but I don't think that needs to be tracked with the blocker review process. Releng is going to be painfully aware if their pushes are failing anyway and working as fast as they can to fix them.
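The "queued for stable" requirement discussed above can be sketched as a simple check. This is only an illustration: the `Update` class and its field names are made up for this example and are not Bodhi's data model, and the dates are the ones quoted in this thread for the system-upgrade update.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Update:
    # Hypothetical record for this sketch; not Bodhi's real schema.
    name: str
    karma_reached_on: Optional[date]         # day the karma threshold was met
    submitted_for_stable_on: Optional[date]  # day a push to stable was requested


def queued_for_stable(update, go_no_go):
    """True only if the push to stable was requested on or before Go/No-Go day.

    Reaching the karma threshold alone does not count.
    """
    submitted = update.submitted_for_stable_on
    return submitted is not None and submitted <= go_no_go


go_no_go = date(2015, 10, 29)
system_upgrade = Update(
    name="dnf-plugin-system-upgrade-0.7.0-1.fc22",
    karma_reached_on=date(2015, 10, 29),
    submitted_for_stable_on=date(2015, 11, 2),
)
print(queued_for_stable(system_upgrade, go_no_go))  # False
```

Under this definition the update would have counted as unresolved at Go/No-Go, even though it had enough karma that day.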
Hmm, however, that update had karma autopush set, and its threshold was reached. So what exactly is the difference between maintainer asking to push and bodhi autokarma asking to push? I assume they should behave exactly the same. Maybe autokarma push was not triggered, or delayed for some reason? We need to find the difference and define that condition more precisely, or fix a bug somewhere, else it's quite likely we'll have a similar problem in the future.
OK. I was just trying to point out that there needs to be a period of about 1-2 days (releng knows best) between the 0 day push and the actual announcement. Maybe that's why we usually have 4 days between Go and the announcement? I don't know whether there's a releng process for it or not, but since QA wants to track 0 day updates more reliably, it would make sense to also have a similar process in releng space to ensure the updates are ready for everyone when the announcement goes out. I'd like to avoid the system-upgrade sad story next time.
I've tried to find out some of the technical details of this. Mirrormanager publishes the current hash of repomd.xml, and also hashes of usually up to two older repomd.xml files. You can see it here: https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=x86_64 It's the <hash> tag in <file> and <alternates> tags.
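As a rough illustration of pulling the current and alternate hashes out of such a document, here is a small Python sketch. The sample XML is hand-written and heavily shortened just to show the <file> / <alternates> / <hash> nesting mentioned above; it is not a real response from the mirror list service, and the real metalink has more elements and uses XML namespaces (which is why the sketch matches on local tag names only).

```python
import xml.etree.ElementTree as ET

# Hand-written, shortened sample; NOT a real metalink response.
SAMPLE_METALINK = """\
<metalink>
  <files>
    <file name="repomd.xml">
      <verification>
        <hash type="sha256">current-hash</hash>
      </verification>
      <alternates>
        <alternate>
          <verification>
            <hash type="sha256">older-hash-1</hash>
          </verification>
        </alternate>
        <alternate>
          <verification>
            <hash type="sha256">older-hash-2</hash>
          </verification>
        </alternate>
      </alternates>
    </file>
  </files>
</metalink>
"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}alternates'."""
    return tag.rsplit("}", 1)[-1]


def hashes_under(element, hash_type):
    """Collect the text of all <hash type=hash_type> elements below element."""
    return [
        el.text.strip()
        for el in element.iter()
        if local_name(el.tag) == "hash" and el.get("type") == hash_type and el.text
    ]


def repomd_hashes(xml_text, hash_type="sha256"):
    """Return (current_hash, [alternate_hashes]) for the first <file> element."""
    root = ET.fromstring(xml_text)
    file_el = next(el for el in root.iter() if local_name(el.tag) == "file")
    alternates = []
    for el in file_el.iter():
        if local_name(el.tag) == "alternates":
            alternates = hashes_under(el, hash_type)
    # Whatever hash is not listed under <alternates> is the current one.
    current = [h for h in hashes_under(file_el, hash_type) if h not in alternates]
    return (current[0] if current else None), alternates


print(repomd_hashes(SAMPLE_METALINK))
```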
Here's a nice graph showing how often our mirrors distribute current, or older content: https://adrian.fedorapeople.org/f22-updates-repomd-propagation.svg
The time constraints are defined here: https://github.com/fedora-infra/mirrormanager2/blob/master/mirrormanager2/li... If the current push is older than 2 days, there should be no alternate hashes older than 3 days. If the current push is younger, there can be one hash arbitrarily old, but no further hashes older than 3 days. I hope I read the code correctly (the docstring doesn't seem to match it exactly).
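Written out as code, my reading of those rules looks roughly like the sketch below. It models only the description in the previous paragraph, not the actual mirrormanager2 implementation, and the choice of *which* old hash survives when the current push is recent (here: the newest of the stale ones) is an assumption.

```python
from datetime import datetime, timedelta

MAX_ALTERNATE_AGE = timedelta(days=3)
RECENT_PUSH = timedelta(days=2)


def alternates_to_keep(now, current_push, older_pushes):
    """Return the timestamps of older repomd.xml versions kept as alternates.

    now, current_push: datetimes; older_pushes: datetimes of earlier pushes.
    """
    fresh = [ts for ts in older_pushes if now - ts <= MAX_ALTERNATE_AGE]
    stale = [ts for ts in older_pushes if now - ts > MAX_ALTERNATE_AGE]
    if now - current_push <= RECENT_PUSH and stale:
        # Current push is recent: one arbitrarily old hash may remain.
        # Assumption: it is the newest of the stale ones.
        fresh.append(max(stale))
    return sorted(fresh, reverse=True)


now = datetime(2015, 11, 10, 12, 0)
older = [now - timedelta(days=2), now - timedelta(days=5), now - timedelta(days=9)]
# Current push is 1 day old: the 2-day-old hash plus one stale hash remain.
print(alternates_to_keep(now, now - timedelta(days=1), older))
```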
However, it also depends on how often metalink is regenerated, the old items will not disappear on their own. I learned that all metalinks are regenerated based on any of these fedmsg events: org.fedoraproject.prod.bodhi.updates.fedora.sync org.fedoraproject.prod.compose.rawhide.rsync.complete org.fedoraproject.prod.compose.branched.rsync.complete So if there is no new push (in any repository), metalinks are not regenerated and old hashes are not dropped. Theoretically releng could send out one of those fedmsg events artificially to trigger metalink regeneration, if needed.
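To illustrate why old hashes can linger, here is a tiny Python sketch that finds the last time a metalink regeneration would have been triggered from a list of events. The three topic names are the ones listed above; the event list, the `org.example...` topic and the function name are invented for the example.

```python
from datetime import datetime

REGENERATION_TOPICS = frozenset({
    "org.fedoraproject.prod.bodhi.updates.fedora.sync",
    "org.fedoraproject.prod.compose.rawhide.rsync.complete",
    "org.fedoraproject.prod.compose.branched.rsync.complete",
})


def last_regeneration(events):
    """events: iterable of (timestamp, topic) pairs.

    Returns the latest timestamp whose topic triggers regeneration, or None.
    """
    times = [ts for ts, topic in events if topic in REGENERATION_TOPICS]
    return max(times) if times else None


events = [
    (datetime(2015, 11, 5, 8, 0), "org.fedoraproject.prod.bodhi.updates.fedora.sync"),
    (datetime(2015, 11, 6, 9, 0), "org.example.some.other.topic"),  # hypothetical, ignored
]
print(last_regeneration(events))
```

If nothing in the list matches, the metalink (and its alternate hashes) simply stays as it was.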
Currently, there is no way to generate a new metalink with alternate hashes disabled, or to remove those alternates from the metalink intentionally at some later point. That would require patching our tools. It would of course put a larger load on our master mirror and on those mirrors which managed to sync quickly, because it would disqualify any mirrors which are not completely current. It could come in handy in some situations, but unfortunately the tools do not allow it at the moment.
The second part of the story is dnf. In dnf, metadata_expire= option defines how often metalink is pulled again and new metadata are downloaded if the cached metadata hash differs from the current hash in the metalink. However, if the top-listed repository is not completely up-to-date (it contains current-1 or current-2 metadata), but its hash is listed among alternate hashes in the metalink, dnf is fine with that and does not attempt to query different repos to retrieve the very current metadata. That means that as long as the metalink contains some older hashes, and some repository offers that older metadata, some users might not get latest metadata. The default value for metadata_expire is 6 hours for stable updates.
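The behaviour described above boils down to two checks, shown here as a toy Python model. This is not dnf's or librepo's actual code, and the function names are made up; it only encodes the rules as I've described them.

```python
def needs_refresh(cached_hash, metalink_current_hash):
    """After metadata_expire has elapsed: re-download only if the cached
    repomd.xml hash differs from the current hash in the metalink."""
    return cached_hash != metalink_current_hash


def mirror_is_acceptable(mirror_hash, current_hash, alternate_hashes):
    """A mirror is accepted if it serves the current metadata OR any
    older metadata still listed among the alternates."""
    return mirror_hash == current_hash or mirror_hash in alternate_hashes


current = "hash-with-fix"
alternates = ["hash-before-fix"]

# A slightly stale mirror is still accepted, so the user may not see the fix.
print(mirror_is_acceptable("hash-before-fix", current, alternates))  # True
# Once the alternate is dropped from the metalink, the same mirror is rejected.
print(mirror_is_acceptable("hash-before-fix", current, []))          # False
```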
So, the outcome of this exercise is: If we want to be sure the latest updates are available to _all_ our users, we need to wait until there are no older metadata hashes in the metalink and then 6 more hours. There will be no older metadata hashes in the metalink once 3 days have passed since the push of the important update *and* the metalink has been regenerated after that point, either by a new push or by releng sending out a fake event manually.
This is, uh, a) quite a long time and b) complex. I'll be very glad if you can point out anything that I've described or understood wrong.
I've tried to find out some of the technical details of this. Mirrormanager publishes the current hash of repomd.xml, and also hashes of usually up to two older repomd.xml files. You can see it here: https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=x86_64 It's the <hash> tag in <file> and <alternates> tags.
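To make the current hash and its alternates easier to inspect, here is a minimal sketch that lists them from a metalink document. The embedded sample is a simplified, made-up document in the general shape of a MirrorManager metalink (the namespace URIs, the `timestamp` element and the hash values are assumptions for illustration), and the helper names are hypothetical. Tags are matched by local name so the code does not depend on the exact namespaces.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up sample; real metalinks carry more fields (size, urls, ...).
SAMPLE = """<metalink version="3.0" xmlns="http://www.metalinker.org/"
          xmlns:mm0="http://fedorahosted.org/mirrormanager">
 <files>
  <file name="repomd.xml">
   <mm0:timestamp>1448970000</mm0:timestamp>
   <verification>
    <hash type="sha256">aaa111</hash>
   </verification>
   <mm0:alternates>
    <mm0:alternate>
     <mm0:timestamp>1448880000</mm0:timestamp>
     <verification>
      <hash type="sha256">bbb222</hash>
     </verification>
    </mm0:alternate>
    <mm0:alternate>
     <mm0:timestamp>1448790000</mm0:timestamp>
     <verification>
      <hash type="sha256">ccc333</hash>
     </verification>
    </mm0:alternate>
   </mm0:alternates>
  </file>
 </files>
</metalink>
"""


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def children(elem, name):
    """Direct children of elem whose local tag name equals name."""
    return [c for c in elem if local(c.tag) == name]


def entry_info(elem, hash_type="sha256"):
    """Return (timestamp, hash) for a <file> or <alternate> element."""
    timestamp = int(children(elem, "timestamp")[0].text)
    verification = children(elem, "verification")[0]
    digest = next(h.text for h in children(verification, "hash")
                  if h.get("type") == hash_type)
    return timestamp, digest


def list_hashes(metalink_xml, hash_type="sha256"):
    """Return the current (timestamp, hash) and the list of alternates."""
    root = ET.fromstring(metalink_xml)
    file_elem = next(e for e in root.iter() if local(e.tag) == "file")
    current = entry_info(file_elem, hash_type)
    alternates = [entry_info(a, hash_type)
                  for a in file_elem.iter() if local(a.tag) == "alternate"]
    return current, alternates


if __name__ == "__main__":
    current, alternates = list_hashes(SAMPLE)
    print("current:", current)
    for alt in alternates:
        print("alternate:", alt)
```

The same function could be pointed at the text fetched from the metalink URL above, provided the real document has the shape assumed here.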
Here's a nice graph showing how often our mirrors distribute current, or older content: https://adrian.fedorapeople.org/f22-updates-repomd-propagation.svg
The time constraints are defined here: https://github.com/fedora-infra/mirrormanager2/blob/master/mirrormanager2/li... If the current push is older than 2 days, there should be no alternate hashes older than 3 days. If the current push is younger, there can be one hash arbitrarily old, but no further hashes older than 3 days. I hope I read the code correctly (the docstring doesn't seem to match it exactly).
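The rule as described in the paragraph above can be written down as a small function. This is only a sketch of that description, not the mirrormanager2 code (which, as noted, may differ from this reading); the function name and parameters are hypothetical, and keeping the *newest* of the stale alternates as the "one arbitrarily old hash" is an assumption.

```python
DAY = 24 * 60 * 60  # seconds


def prune_alternates(current_ts, alternate_ts, now,
                     fresh_window=2 * DAY, max_age=3 * DAY):
    """Return the alternate timestamps that survive the rule described above.

    current_ts   -- timestamp of the current push
    alternate_ts -- timestamps of older repomd.xml versions
    now          -- time at which the metalink is regenerated
    """
    recent = [t for t in alternate_ts if now - t <= max_age]
    stale = [t for t in alternate_ts if now - t > max_age]
    kept = list(recent)
    if now - current_ts < fresh_window and stale:
        # A young current push may keep one arbitrarily old alternate.
        # Assumption: the newest of the stale ones is the one kept.
        kept.append(max(stale))
    return sorted(kept, reverse=True)


if __name__ == "__main__":
    now = 10 * DAY
    # Current push is half a day old: one stale alternate survives.
    print(prune_alternates(now - DAY // 2, [8 * DAY, 5 * DAY, 4 * DAY], now))
    # Current push is 2.5 days old: only alternates within 3 days survive.
    print(prune_alternates(7 * DAY + DAY // 2, [7 * DAY + 1, 5 * DAY, 4 * DAY], now))
```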
However, it also depends on how often the metalink is regenerated; the old items will not disappear on their own. I learned that all metalinks are regenerated based on any of these fedmsg events:
org.fedoraproject.prod.bodhi.updates.fedora.sync
org.fedoraproject.prod.compose.rawhide.rsync.complete
org.fedoraproject.prod.compose.branched.rsync.complete
So if there is no new push (in any repository), metalinks are not regenerated and old hashes are not dropped. Theoretically releng could send out one of those fedmsg events artificially to trigger metalink regeneration, if needed.
Currently, there are no means to generate a new metalink with alternative hashes disabled, or removing those alternatives from the metalink intentionally at some point of time afterwards. That would require patching our tools. This would of course lead to a larger load on our master mirror and those mirrors which managed to get synced quickly, because that would disqualify any other mirrors which are not completely current. But it could get handy in some situations, unfortunately the tools do not allow it at the moment.
The second part of the story is dnf. In dnf, metadata_expire= option defines how often metalink is pulled again and new metadata are downloaded if the cached metadata hash differs from the current hash in the metalink. However, if the top-listed repository is not completely up-to-date (it contains current-1 or current-2 metadata), but its hash is listed among alternate hashes in the metalink, dnf is fine with that and does not attempt to query different repos to retrieve the very current metadata. That means that as long as the metalink contains some older hashes, and some repository offers that older metadata, some users might not get latest metadata. The default value for metadata_expire is 6 hours for stable updates.
So, the outcome of this exercise is: If we want to be sure the latest updates are available to _all_ our users, we need to wait until there are no older metadata hashes in the metalink and then 6 more hours. There will be no older metadata hashes in the metalink once 3 days have passed since the push of the important update *and* the metalink has been regenerated after that point, either by a new push or by releng sending out a fake event manually.
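As a worked example of the waiting time just described, the following sketch computes the earliest moment at which all clients should see the important update. It takes the 3-day alternate age limit and the 6-hour metadata_expire default from the text above; the function name, its parameters and the example dates are hypothetical.

```python
from datetime import datetime, timedelta


def earliest_safe_announcement(important_push, later_regenerations,
                               alternate_max_age=timedelta(days=3),
                               metadata_expire=timedelta(hours=6)):
    """Return the earliest safe announcement time, or None if unknown.

    important_push      -- when the push containing the fix happened
    later_regenerations -- times at which the metalink was (or will be)
                           regenerated, by a new push or a manual event
    """
    # Older hashes can only be dropped by a regeneration that happens
    # after the age limit has passed since the important push.
    cutoff = important_push + alternate_max_age
    candidates = [t for t in later_regenerations if t >= cutoff]
    if not candidates:
        return None  # no qualifying regeneration yet
    # Clients then need up to one metadata_expire period to re-check.
    return min(candidates) + metadata_expire


if __name__ == "__main__":
    push = datetime(2015, 11, 3, 12, 0)
    regenerations = [datetime(2015, 11, 4, 12, 0), datetime(2015, 11, 6, 18, 0)]
    print(earliest_safe_announcement(push, regenerations))
```

With the example dates, the regeneration one day after the push is too early to drop the old hashes, so the answer is driven by the later one plus 6 hours.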
This is, uh, a) quite a long time and b) complex. I'll be very glad if you can point out anything that I've described or understood wrong.
Taking all of this into account, would this be a reasonable idea?
1. At Go/No-Go voting time, all updates which block F-N release but belong to F-M (M<N) release, must be already pushed stable. If this is not the case and it's the last blocking issue, selected tasks (like copying compose trees into appropriate places) can be performed, and Go/No-Go will be rescheduled to the day and time when it is expected that those updates will have been pushed.
2. We will create a new mirrormanager script which will go through the specified metalink(s) and remove all metadata hashes which are older than provided timestamp/hash.
3. If there are such updates as mentioned in point 1., RelEng will use this script to remove old metadata alternatives from the metalink, which means only metadata from the day this update was pushed or newer will be kept. In order to not increase mirror strain too much, this doesn't need to be used immediately, but just shortly before the release announcement (so that mirrors have time to sync latest packages, and the user load is distributed among more mirrors including those with current-1 or current-2 trees as long as possible).
4. Once the script is run in point 3., we can post the release announcement in 6 hours.
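The effect of the script proposed in point 2 can be illustrated on a metalink document directly. This is only a sketch of the intended outcome: a real tool would presumably work inside MirrorManager rather than edit XML, and the sample document, the namespace URI and the function name are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the mm0: elements; treat as illustrative.
MM_NS = "http://fedorahosted.org/mirrormanager"

# Simplified, made-up metalink with two alternates.
SAMPLE = """<metalink version="3.0" xmlns="http://www.metalinker.org/"
          xmlns:mm0="http://fedorahosted.org/mirrormanager">
 <files>
  <file name="repomd.xml">
   <mm0:timestamp>1448970000</mm0:timestamp>
   <verification><hash type="sha256">aaa111</hash></verification>
   <mm0:alternates>
    <mm0:alternate>
     <mm0:timestamp>1448880000</mm0:timestamp>
     <verification><hash type="sha256">bbb222</hash></verification>
    </mm0:alternate>
    <mm0:alternate>
     <mm0:timestamp>1448790000</mm0:timestamp>
     <verification><hash type="sha256">ccc333</hash></verification>
    </mm0:alternate>
   </mm0:alternates>
  </file>
 </files>
</metalink>
"""


def drop_old_alternates(metalink_xml, cutoff_ts):
    """Remove alternates older than cutoff_ts.

    Returns (new_xml_text, number_of_alternates_removed).
    """
    root = ET.fromstring(metalink_xml)
    removed = 0
    # Materialize the list first so removal does not disturb iteration.
    for alternates in list(root.iter("{%s}alternates" % MM_NS)):
        for alt in list(alternates):
            ts = alt.find("{%s}timestamp" % MM_NS)
            if ts is not None and int(ts.text) < cutoff_ts:
                alternates.remove(alt)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    # Keep only metadata from the push at 1448880000 or newer.
    new_xml, removed = drop_old_alternates(SAMPLE, 1448880000)
    print("removed", removed, "alternate(s)")
```

Passing the timestamp of the push that contained the blocker fix as the cutoff keeps that push and anything newer, which matches point 3.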
I know there is still one manual step involved (figuring out in which push the blocker update was included), but I don't know how to better solve it, especially if we don't want to wait for too long.
I would be interested in Infra/RelEng feedback for the technical part of this (CCing Kevin and Dennis). Do you think this is reasonable solution, or am I completely off the track here? Do you see any better options?
Thanks, Kamil
On Tue, 1 Dec 2015 07:21:04 -0500 (EST) Kamil Paral kparal@redhat.com wrote:
Taking all of this into account, would this be a reasonable idea?
1. At Go/No-Go voting time, all updates which block F-N release but belong to F-M (M<N) release, must be already pushed stable. If this is not the case and it's the last blocking issue, selected tasks (like copying compose trees into appropriate places) can be performed, and Go/No-Go will be rescheduled to the day and time when it is expected that those updates will have been pushed.
I think that's not a great idea. It gets back to why we only slip in one-week increments. If, say, we have a go/no-go on a Thursday and the only thing blocking it is some update that's not pushed stable all the way yet, we reschedule for Friday, and if it's not done then we schedule for Saturday? This means everyone has to work extra hours without even being sure when the release will be. It leaves less time to sync mirrors, update common bugs, etc.
So, the alternative there would be to slip a week to get it pushed, but some people may find that excessive.
2. We will create a new mirrormanager script which will go through the specified metalink(s) and remove all metadata hashes which are older than provided timestamp/hash.
Something like that should be pretty easy to do I would think. (Although I am not a mm developer)
3. If there are such updates as mentioned in point 1., RelEng will use this script to remove old metadata alternatives from the metalink, which means only metadata from the day this update was pushed or newer will be kept. In order to not increase mirror strain too much, this doesn't need to be used immediately, but just shortly before the release announcement (so that mirrors have time to sync latest packages, and the user load is distributed among more mirrors including those with current-1 or current-2 trees as long as possible).
4. Once the script is run in point 3., we can post the release announcement in 6 hours.
I know there is still one manual step involved (figuring out in which push the blocker update was included), but I don't know how to better solve it, especially if we don't want to wait for too long.
I would be interested in Infra/RelEng feedback for the technical part of this (CCing Kevin and Dennis). Do you think this is reasonable solution, or am I completely off the track here? Do you see any better options?
So, looking back, we had the case of that dnf-system-upgrade. Are there any others in the past, or are we making a bigger than life deal out of one case?
Also, that case could have been solved by dropping the alternates in metalink as you suggest above at 2 right?
One thing that perhaps we could improve is to somehow note these sorts of things to releng. I just checked irc logs and I didn't see any mention of that dnf-system-upgrade plugin update being important until nov 3rd. Would a tracker ticket help this?
kevin
On Tue, 2015-12-01 at 19:40 -0700, Kevin Fenzi wrote:
One thing that perhaps we could improve is to somehow note these sorts of things to releng. I just checked irc logs and I didn't see any mention of that dnf-system-upgrade plugin update being important until nov 3rd. Would a tracker ticket help this?
We're already working on that side of things, see elsewhere in the thread.
Taking all of this into account, would this be a reasonable idea?
1. At Go/No-Go voting time, all updates which block F-N release but belong to F-M (M<N) release, must be already pushed stable. If this is not the case and it's the last blocking issue, selected tasks (like copying compose trees into appropriate places) can be performed, and Go/No-Go will be rescheduled to the day and time when it is expected that those updates will have been pushed.
I think thats not a great idea. It gets back to why we only slip in one week increments. If say we have a go/no-go on a thursday and the only thing blocking it is some update thats not pushed stable all the way yet, we reschedule for friday and if it's not done then we schedule for saturday? This means everyone has to work extra hours without even being sure when the release will be.
If the update is pending stable and just not pushed, it might make sense to move it one day, yes (most probably skipping weekends, though). If it needs more testing, we might decide to postpone it several days. If it's not available at all yet, waiting an extra week might be the right choice. So it would depend on the situation and best guess of folks at Go/No-Go.
Leaves less time to sync mirrors, update common bugs, etc etc.
I would say the opposite - all of that can start happening right away, it's not blocked on waiting for the FN-x push. So in case the announcement gets out on Tuesday as usual, it's the same time, but if it gets pushed back to Wednesday or later day, it's more time for these tasks to happen. The only exception is that FN-x updates repo, which will get shorter sync time because we want to make sure people download the fixed packages, not old ones. Currently that behavior is undefined.
So, the alternative there would be to slip a week to get it pushed, but some people may find that excessive.
That's why I wanted to propose something more flexible, but hey, it's just an idea.
2. We will create a new mirrormanager script which will go through the specified metalink(s) and remove all metadata hashes which are older than provided timestamp/hash.
Something like that should be pretty easy to do I would think. (Although I am not a mm developer)
Looking into existing MM scripts, I have the same opinion, but I can contact Adrian to confirm. If we want to make it even simpler, we can drop all alternative metadata and leave just the current hash (that script would be run once the push containing that critical update is performed).
3. If there are such updates as mentioned in point 1., RelEng will use this script to remove old metadata alternatives from the metalink, which means only metadata from the day this update was pushed or newer will be kept. In order to not increase mirror strain too much, this doesn't need to be used immediately, but just shortly before the release announcement (so that mirrors have time to sync latest packages, and the user load is distributed among more mirrors including those with current-1 or current-2 trees as long as possible).
4. Once the script is run in point 3., we can post the release announcement in 6 hours.
I know there is still one manual step involved (figuring out in which push the blocker update was included), but I don't know how to better solve it, especially if we don't want to wait for too long.
I would be interested in Infra/RelEng feedback for the technical part of this (CCing Kevin and Dennis). Do you think this is reasonable solution, or am I completely off the track here? Do you see any better options?
So, looking back, we had the case of that dnf-system-upgrade. Are there any others in the past, or are we making a bigger than life deal out of one case?
I don't want to exaggerate the topic, but I'd also like to find and describe a process how we can avoid it next time. It will be needed twice a year at maximum.
I believe there were a few similar issues in the past, but I can't really point to any other examples. In majority of cases, this is likely to be related to system upgrade (system-upgrade, dnf, plymouth, systemd, gpg keys).
Also, that case could have been solved by dropping the alternates in metalink as you suggest above at 2 right?
Yes.
One thing that perhaps we could improve is to somehow note these sorts of things to releng. I just checked irc logs and I didn't see any mention of that dnf-system-upgrade plugin update being important until nov 3rd. Would a tracker ticket help this?
In the future, these issues should be tracked by blocker bugs app using bugzilla tracker and a specific keyword, so we should not lose track of this. But as mentioned, pushing to stable is not enough, we also need to make sure old content is not served to users. That's why the "dropping alternative metadata from metalink" idea. We can file a releng ticket for this, and either include a description of what needs to be done, or link to some wiki SOP. QA can take care of all of that. The only thing that we need to ensure is that it really is handled before the announcement goes live, so it needs to be listed somewhere in RelEng/Infra "new release" SOP.
On 12/02/2015 06:42 AM, Kamil Paral wrote:
Taking all of this into account, would this be a reasonable idea? 1. At Go/No-Go voting time, all updates which block F-N release but belong to F-M (M<N) release, must be already pushed stable. If this is not the case and it's the last blocking issue, selected tasks (like copying compose trees into appropriate places) can be performed, and Go/No-Go will be rescheduled to the day and time when it is expected that those updates will have been pushed.
I think thats not a great idea. It gets back to why we only slip in one week increments. If say we have a go/no-go on a thursday and the only thing blocking it is some update thats not pushed stable all the way yet, we reschedule for friday and if it's not done then we schedule for saturday? This means everyone has to work extra hours without even being sure when the release will be.
If the update is pending stable and just not pushed, it might make sense to move it one day, yes (most probably skipping weekends, though).
Sure, this I think is sane.
If it needs more testing, we might decide to postpone it several days. If it's not available at all yet, waiting an extra week might be the right choice. So it would depend on the situation and best guess of folks at Go/No-Go.
I think this shouldn't be conditional: if at Go/No-Go the update isn't at least ready to hit the button, then we slip a week. Period. "Waiting a couple days for testing" is just adding unnecessary uncertainty.
On Wed, 2 Dec 2015 06:42:09 -0500 (EST) Kamil Paral kparal@redhat.com wrote:
If the update is pending stable and just not pushed, it might make sense to move it one day, yes (most probably skipping weekends, though). If it needs more testing, we might decide to postpone it several days. If it's not available at all yet, waiting an extra week might be the right choice. So it would depend on the situation and best guess of folks at Go/No-Go.
But then no one knows whats happening. People could assume it's go and do a bunch of release prep, but find out it's not and then have wasted their time (or at least rushed when they had more time).
Leaves less time to sync mirrors, update common bugs, etc etc.
I would say the opposite - all of that can start happening right away, it's not blocked on waiting for the FN-x push. So in case the announcement gets out on Tuesday as usual, it's the same time, but if it gets pushed back to Wednesday or later day, it's more time for these tasks to happen. The only exception is that FN-x updates repo, which will get shorter sync time because we want to make sure people download the fixed packages, not old ones. Currently that behavior is undefined.
There's a bunch of things that coordinate for the release. IMHO shifting it a few days would make those more difficult. "Hey mirrors, here's f23 content, our release is... sometime next week, we don't have our crap together enough to tell you what day, but expect someday the bit will flip and you will get a bunch more traffic. Hope that helps you"
So, the alternative there would be to slip a week to get it pushed, but some people may find that excessive.
That's why I wanted to propose something more flexible, but hey, it's just an idea.
Sure. ;)
...snip...
In the future, these issues should be tracked by blocker bugs app using bugzilla tracker and a specific keyword, so we should not lose track of this. But as mentioned, pushing to stable is not enough, we also need to make sure old content is not served to users. That's why the "dropping alternative metadata from metalink" idea. We can file a releng ticket for this, and either include a description of what needs to be done, or link to some wiki SOP. QA can take care of all of that. The only thing that we need to ensure is that it really is handled before the announcement goes live, so it needs to be listed somewhere in RelEng/Infra "new release" SOP.
I definitely like the idea of tracking this better and making sure to tell releng whats needed. In fact, I think that may be all we need, especially with bodhi2 pushing updates faster than 1 did.
kevin
On Wednesday, December 02, 2015 06:42:09 AM Kamil Paral wrote:
Taking all of this into account, would this be a reasonable idea?
1. At Go/No-Go voting time, all updates which block F-N release but belong to F-M (M<N) release, must be already pushed stable. If this is not the case and it's the last blocking issue, selected tasks (like copying compose trees into appropriate places) can be performed, and Go/No-Go will be rescheduled to the day and time when it is expected that those updates will have been pushed.
I think thats not a great idea. It gets back to why we only slip in one week increments. If say we have a go/no-go on a thursday and the only thing blocking it is some update thats not pushed stable all the way yet, we reschedule for friday and if it's not done then we schedule for saturday? This means everyone has to work extra hours without even being sure when the release will be.
If the update is pending stable and just not pushed, it might make sense to move it one day, yes (most probably skipping weekends, though). If it needs more testing, we might decide to postpone it several days. If it's not available at all yet, waiting an extra week might be the right choice. So it would depend on the situation and best guess of folks at Go/No-Go.
I am with Kevin here. We have things tightly coupled with mirrors and mirroring, and making changes by a day or two throws timings way off, purely because we have a built-in sync buffer of the weekend. To slip the go/no-go decision to Monday we would need to push out the ship date from Tuesday to Friday to give mirrors syncing time, and that is making things somewhat tight. We really need to slip a week for any slip.
Leaves less time to sync mirrors, update common bugs, etc etc.
I would say the opposite - all of that can start happening right away, it's not blocked on waiting for the FN-x push. So in case the announcement gets out on Tuesday as usual, it's the same time, but if it gets pushed back to Wednesday or later day, it's more time for these tasks to happen. The only exception is that FN-x updates repo, which will get shorter sync time because we want to make sure people download the fixed packages, not old ones. Currently that behavior is undefined.
We cannot put the bits onto the mirrors until we are sure they are the bits; otherwise we offer the mirrors lots of churn, wasted IOPS and bandwidth, and we lose mirrors.
So, the alternative there would be to slip a week to get it pushed, but some people may find that excessive.
That's why I wanted to propose something more flexible, but hey, it's just an idea.
In order to be more flexible here we would really need to change fundamentally how we push bits to the mirrors. If we had a CDN of our own under our control we would have more options available, but the cost of that would be massive.
2. We will create a new mirrormanager script which will go through the specified metalink(s) and remove all metadata hashes which are older than provided timestamp/hash.
Something like that should be pretty easy to do I would think. (Although I am not a mm developer)
Looking into existing MM scripts, I have the same opinion, but I can contact Adrian to confirm. If we want to make it even simpler, we can drop all alternative metadata and leave just the current hash (that script would be run once the push containing that critical update is performed).
I am okay with having a way to say ship only the latest metadata.
3. If there are such updates as mentioned in point 1., RelEng will use this script to remove old metadata alternatives from the metalink, which means only metadata from the day this update was pushed or newer will be kept. In order to not increase mirror strain too much, this doesn't need to be used immediately, but just shortly before the release announcement (so that mirrors have time to sync latest packages, and the user load is distributed among more mirrors including those with current-1 or current-2 trees as long as possible).
4. Once the script is run in point 3., we can post the release announcement in 6 hours.
I know there is still one manual step involved (figuring out in which push the blocker update was included), but I don't know how to better solve it, especially if we don't want to wait for too long.
I would be interested in Infra/RelEng feedback for the technical part of this (CCing Kevin and Dennis). Do you think this is reasonable solution, or am I completely off the track here? Do you see any better options?
So, looking back, we had the case of that dnf-system-upgrade. Are there any others in the past, or are we making a bigger than life deal out of one case?
I don't want to exaggerate the topic, but I'd also like to find and describe a process how we can avoid it next time. It will be needed twice a year at maximum.
I believe there were a few similar issues in the past, but I can't really point to any other examples. In majority of cases, this is likely to be related to system upgrade (system-upgrade, dnf, plymouth, systemd, gpg keys).
There is always the potential of hitting issues with upgrades. An older release gets a higher NVR and things get messy. It is not an issue just at release time.
Also, that case could have been solved by dropping the alternates in metalink as you suggest above at 2 right?
Yes.
One thing that perhaps we could improve is to somehow note these sorts of things to releng. I just checked irc logs and I didn't see any mention of that dnf-system-upgrade plugin update being important until nov 3rd. Would a tracker ticket help this?
In the future, these issues should be tracked by blocker bugs app using bugzilla tracker and a specific keyword, so we should not lose track of this. But as mentioned, pushing to stable is not enough, we also need to make sure old content is not served to users. That's why the "dropping alternative metadata from metalink" idea. We can file a releng ticket for this, and either include a description of what needs to be done, or link to some wiki SOP. QA can take care of all of that. The only thing that we need to ensure is that it really is handled before the announcement goes live, so it needs to be listed somewhere in RelEng/Infra "new release" SOP.
We have no way of always ensuring that people are getting the latest data, or that they have the latest bits installed. People can always shoot themselves in the foot: people can and will do a distro update without updating the running OS first. I would suggest not filing a rel-eng ticket and telling us what to do, as that will not go over well. We should now sit down and work out a process. Then a ticket likely needs to be filed asking that the process be followed.
Dennis
On Fri, 2015-12-04 at 12:20 -0600, Dennis Gilmore wrote:
There is always the potential of hitting issues with upgrades. An older release gets a higher NVR and things get messy. It is not an issue just at release time.
This is true, but release time is important, because we're very publicly visible, and a *lot* of people try to upgrade right around release time. It's a bad look when there's a major bug that breaks all or most upgrades at release time.
I definitely like the idea of tracking this better and making sure to tell releng whats needed. In fact, I think that may be all we need, especially with bodhi2 pushing updates faster than 1 did.
I see 3 parts here:
1. tracking these issues (that will be done by QA with our new trackers)
2. knowing what needs to be done (that I'm trying to figure out in this thread), and implementing necessary tools if needed
3. making sure it's done (it needs to be part of some QA or RelEng SOP to make sure it's not forgotten when pushing out the new release)
I am with Kevin here, we have things tightly coupled with mirrors and mirroring, making changes by a day or two throws timings way off. purely because we have a built in sync buffer of the weekend. To slip the go/no go decision to Monday we would need to push out the ship date from Tuesday to Friday to give mirrors syncing time and that is making things somewhat tight. We really need to slip a week for any slip
OK, a week slip it is.
Leaves less time to sync mirrors, update common bugs, etc etc.
I would say the opposite - all of that can start happening right away, it's not blocked on waiting for the FN-x push. So in case the announcement gets out on Tuesday as usual, it's the same time, but if it gets pushed back to Wednesday or later day, it's more time for these tasks to happen. The only exception is that FN-x updates repo, which will get shorter sync time because we want to make sure people download the fixed packages, not old ones. Currently that behavior is undefined.
we can not put the bits onto the mirrors until we are sure they are the bits, otherwise we offer the mirrors lots of churn, wasted iops and bandwidth and we lose mirrors.
I think there's still some misunderstanding here, but I don't want to spend more time on this. My core topic is to figure out what steps we need to take in order to deliver the updated (fixed) packages in FN-1/FN-2 repos to all our users on the FN release day. Slip duration is secondary.
Looking into existing MM scripts, I have the same opinion, but I can contact Adrian to confirm. If we want to make it even simpler, we can drop all alternative metadata and leave just the current hash (that script would be run once the push containing that critical update is performed).
I am okay with having a way to say ship only the latest metadata.
Great, so this could be the technical means to do what we're looking for. Now I need to discuss this with MirrorManager devs and ideally convince them to implement this.
we have no way of ensuring always that people are getting the latest data, or that they have the latest bits installed. but people can always shoot themselves in the foot. people can and will do a distro update without updating the running os first.
Yes, they can. We're trying to limit the unintentional foot shooting in this thread. I.e. once we tell them it's safe to upgrade ("Fedora N+1 is here!"), it should really be safe to upgrade.
I would suggest not filing a rel-eng ticket and telling us what to do as that will not go over well. We should now sit down and work out a process. then likely a ticket needs to be filed asking that the process be followed.
That's exactly what I was trying to say, in different words. I hope that means we agree with each other :-) This very thread is my attempt to "sit down and work out a process", and once we have it, and we implement the tool to do the necessary tasks, RelEng's "New Release SOP" can then include something like: "If there are any accepted blockers for previous stable releases [link], please look at the date when the last of them was pushed stable, and run `./mirror-manager-drop-older-metadata REPO DATE`" Or QA can ask you to please not forget about this step, if that's what you prefer.
Does that sound OK?
#2 MOAR METADATA
The alternative is to make the existing Blocker trackers do more work. In this model we wouldn't add any new tracker bugs; we'd just add new 'magic words' in the Whiteboard field. Right now, an accepted blocker is identified by the string 'AcceptedBlocker' appearing in the whiteboard field. We could simply add some more magical strings like that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions welcome).
I'd use this approach and distinguish by whiteboard metadata. It's easier for people to propose (you need to remember just 2 different tracker aliases), and it should not require too many blockerbugs modifications (the Propose page can stay exactly the same, and in the list view we'd just add some column or tag to distinguish different blocker types).
On Wed, 18 Nov 2015 16:36:16 -0800 Adam Williamson adamwill@fedoraproject.org wrote:
<snip>
#2 MOAR METADATA
The alternative is to make the existing Blocker trackers do more work. In this model we wouldn't add any new tracker bugs; we'd just add new 'magic words' in the Whiteboard field. Right now, an accepted blocker is identified by the string 'AcceptedBlocker' appearing in the whiteboard field. We could simply add some more magical strings like that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions welcome).
I kind of like this idea as it's less change and involves creating fewer new bugs. We'd have to make some changes to blockerbugs either way - tflink can say if either approach would be more work in blockerbugs, but I'm gonna guess they'd be fairly similar.
Yeah, I think that either approach would involve a similar amount of effort to change blockerbugs. #2 would be slightly less effort but it's not much.
Once the approach is decided, we can file RFEs for blockerbugs and get that work landed before F24 alpha.
Tim
On Wed, 2015-11-18 at 16:36 -0800, Adam Williamson wrote:
#2 MOAR METADATA
The alternative is to make the existing Blocker trackers do more work. In this model we wouldn't add any new tracker bugs; we'd just add new 'magic words' in the Whiteboard field. Right now, an accepted blocker is identified by the string 'AcceptedBlocker' appearing in the whiteboard field. We could simply add some more magical strings like that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions welcome).
I kind of like this idea as it's less change and involves creating fewer new bugs. We'd have to make some changes to blockerbugs either way - tflink can say if either approach would be more work in blockerbugs, but I'm gonna guess they'd be fairly similar.
Hi again folks!
So it sounds like this option was preferred by everyone who expressed a preference, and it's my choice too, so I figure we should just go for it.
I think we still have some more research/discussion/co-ordination to do before we can propose changes to the release process (especially the go/no-go process) to 'enforce' special blockers, but I think we can go ahead and implement the *tracking* side of the changes now. So I'm gonna propose that we add these new terms for the Whiteboard field:
Accepted0Day (for bugs where the fix must appear in 0-day updates for the new release)
AcceptedStable (for bugs where the fix must appear in updates for the previous stable release(s) by release day of the new release)
I'm not 100% married to either of those, especially the second. If anyone has a better idea, please send it!
Once we decide on the terms, the next step would be to edit the blocker SOPs:
https://fedoraproject.org/wiki/QA:SOP_Blocker_Bug_Meeting
https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process
- the changes shouldn't be too onerous - and to update blockerbugs for the new world order. I know tflink has a lot on his plate, so I might take a cut at that to try and save him the work.
Comments, thoughts, questions as always - thanks folks!
Accepted0Day (for bugs where the fix must appear in 0-day updates for the new release)
AcceptedStable (for bugs where the fix must appear in updates for the previous stable release(s) by release day of the new release)
I'm not 100% married to either of those, especially the second. If anyone has a better idea, please send it!
It works, but I'm not completely satisfied with 'AcceptedStable' either, there are too many different interpretations. Potential other keywords could be 'AcceptedPrevious' or 'AcceptedPrevRelease'. The last one is the longest, but it's also most obvious. It would probably be my choice, if we didn't come up with something even better.
Up top I'd like to note there are really kind of two buckets of 'special blockers' for any given release. If the release being validated is N, they are:
- Bugs for which the fix must be in the 0-day update set for N
- Bugs for which the fix must be stable in N-1 and N-2 by N release day
Hello folks,
I'd like to return to the high-level overview for this topic and discuss the changes we plan to do in our SOPs.
So far, we have decided to tag bugs from bucket 1) as Accepted0Day and bugs from bucket 2) as AcceptedPreviousRelease. I also worked on some technical details for ensuring AcceptedPreviousRelease updates get pushed on time. Now we need to discuss what *happens* when we have a bug of either type.
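To make the tagging scheme concrete, here is a minimal sketch of how a script might sort bugs into the buckets by their Whiteboard field. It is only an illustration: the function name `classify` and the assumption that Whiteboard keywords are separated by whitespace are mine, and this is not how blockerbugs is actually implemented.

```python
# Hypothetical helper, not part of blockerbugs: map Whiteboard keywords to buckets.
KEYWORDS = {
    "AcceptedBlocker": "fix must be in the frozen media / install tree",
    "Accepted0Day": "fix must be in the 0-day update set for release N",
    "AcceptedPreviousRelease": "fix must be stable in N-1 and N-2 by N release day",
}


def classify(whiteboard):
    """Return the accepted-blocker keywords found in a Whiteboard string.

    Assumption: keywords are whitespace-separated tokens.
    """
    tokens = whiteboard.split()
    return [keyword for keyword in KEYWORDS if keyword in tokens]


if __name__ == "__main__":
    for keyword in classify("AcceptedPreviousRelease some_other_tag"):
        print(keyword, "->", KEYWORDS[keyword])
```

Matching on whole tokens rather than substrings avoids a longer keyword being mistaken for a shorter one if more terms are added later.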
== Question #1: Do we slip always? ==
With media blockers, we need to create new media, which ensures a slip (there were a few exceptional situations in the past where we managed to build and test fixed media in a day, and therefore postponed the Go/NoGo decision for a day). With non-media blockers, the affected artifact is either the repository tree (we need to push a new build for Accepted0Day), or a previous release repository tree (we need to push an update for AcceptedPreviousRelease). For Accepted0Day, this will most likely involve critical bugs in components which are not on the default installation media (but for example negatively influence them, or prevent some other important functionality). For AcceptedPreviousRelease, this will most likely involve bugs in upgrading the system, or a few other specific cases like creating a bootable media of the new release or booting the new release in a VM.
Now the question is whether exactly the same rules apply (i.e. if this is not fixed at go/no-go, we slip), or whether in certain cases we would decide to not slip.
Since pushing an update can be done relatively fast, I can imagine that people would propose not to slip if an update is prepared and tested, but not yet pushed stable. Earlier in this thread, I tried to point out that this is not enough, because things can go wrong on multiple levels and we really need to insist that the update is already stable (and the metalink metadata adjusted to not allow usage of older pushes). Of course this can sometimes be handled with the same trick as media blockers, i.e. postponing the Go/NoGo decision for a day, provided RelEng approves. But in general I think we should not avoid slipping and just "hope for the best". These bugs were accepted as blockers and we need to make sure people don't hit them, even if they have the bad luck of being assigned to an older mirror.
Do you see any other cases where we should either not slip, or it would be tempting to not slip and we should discuss and define such use case explicitly?
== Question #2: For how long do we slip? ==
Earlier in this thread, I suggested some kind of a dynamic slip that would reflect how fast we can resolve things (for example perform a push). But both Kevin from Infra and Dennis from RelEng didn't think it was a good idea, and claimed we should slip as usual, i.e. a week (if I misunderstood something, please correct me). Of course this is their field, not QA's, so I definitely believe their judgment.
Do you have some other ideas/proposals, in general or in some specific situations regarding the slip length?
Thanks a lot for feedback, Kamil
On 01/12/2016 08:37 AM, Kamil Paral wrote:
Do you have some other ideas/proposals, in general or in some specific situations regarding the slip length?
I'm wondering if there would be interest in hosting a file containing upgrade requirements for each version. For example it could have the package version requirements needed for a successful upgrade. The upgrade tool could check that and warn the user.
From the blocker bug process perspective, I think this doesn't change much. Instead of ensuring the updated build is pushed to FN/FN-1/FN-2, we would need to ensure this requirements file is updated and pushed (probably as part of the system-upgrade package). So, the same process for us, really.
This would allow us to provide a better experience for important bugs which were not approved as blockers, though. If this definition file contained a package version that is not found (this or later version) while computing the system upgrade, system-upgrade could visibly warn the user that this and this bug still affects the upgrade process and reviewing those is recommended before commencing. Basically this is the same we do in Common Bugs [1], but visible to every user, not just those who stumble upon Common Bugs. I'd like that. But this case is not really related to the blocker bug process and this particular proposal and should be suggested as a feature request to system-upgrade developers. (I'm somewhat skeptical that Will will want to maintain this functionality, even though it would be nice for users. But maybe system-upgrade could simply suggest users to look at the Common Bugs page? That sounds simple enough and maintenance-free.).
So overall a nice idea, but I think it does not directly affect any of those two decisions we need to make. But maybe I missed something :)
[1] https://fedoraproject.org/wiki/Common_F23_bugs#Upgrade_issues
On Tue, Jan 12, 2016 at 6:06 PM, Samuel Sieb samuel@sieb.net wrote:
On 01/12/2016 08:37 AM, Kamil Paral wrote:
Do you have some other ideas/proposals, in general or in some specific situations regarding the slip length?
I'm wondering if there would be interest in hosting a file containing upgrade requirements for each version. For example it could have the package version requirements needed for a successful upgrade. The upgrade tool could check that and warn the user.
One of my concerns is the state of ca-legacy, and whether and how this gets disabled on upgrades. I'm sure there are some other things that have ex post facto unsafe defaults that just stick around through upgrades rather than being reset to new defaults. In my opinion that would violate the Workstation PRD "Upgrading the system multiple times through the upgrade process should give a result that is the same as an original install of Fedora Workstation."
On Wed, 2016-01-13 at 10:13 -0700, Chris Murphy wrote:
On Tue, Jan 12, 2016 at 6:06 PM, Samuel Sieb samuel@sieb.net wrote:
On 01/12/2016 08:37 AM, Kamil Paral wrote:
Do you have some other ideas/proposals, in general or in some specific situations regarding the slip length?
I'm wondering if there would be interest in hosting a file containing upgrade requirements for each version. For example it could have the package version requirements needed for a successful upgrade. The upgrade tool could check that and warn the user.
One of my concerns is the state of ca-legacy, and whether and how this gets disabled on upgrades. I'm sure there are some other things that have ex post facto unsafe defaults that just stick around through upgrades rather than being reset to new defaults. In my opinion that would violate the Workstation PRD "Upgrading the system multiple times through the upgrade process should give a result that is the same as an original install of Fedora Workstation."
This all seems out of scope, as Kamil said. Can we please stick to the non-media blocker policy discussion here? General concerns / ideas for upgrades, and specific potential upgrade issues, should get their own threads.
It's very likely true that upgraded systems get increasingly out of whack with freshly installed ones when it comes to default configurations of various packages - especially ones which don't use the modular, multiply-overridden configuration style, and thus can't easily update the distribution defaults post-install - but this doesn't really seem to have much to do with the question of what the release process policies WRT non-media blockers should be.
On Wed, Jan 13, 2016 at 2:55 PM, Adam Williamson adamwill@fedoraproject.org wrote:
On Wed, 2016-01-13 at 10:13 -0700, Chris Murphy wrote:
On Tue, Jan 12, 2016 at 6:06 PM, Samuel Sieb samuel@sieb.net wrote:
On 01/12/2016 08:37 AM, Kamil Paral wrote:
Do you have some other ideas/proposals, in general or in some specific situations regarding the slip length?
I'm wondering if there would be interest in hosting a file containing upgrade requirements for each version. For example it could have the package version requirements needed for a successful upgrade. The upgrade tool could check that and warn the user.
One of my concerns is the state of ca-legacy, and whether and how this gets disabled on upgrades. I'm sure there are some other things that have ex post facto unsafe defaults that just stick around through upgrades rather than being reset to new defaults. In my opinion that would violate the Workstation PRD "Upgrading the system multiple times through the upgrade process should give a result that is the same as an original install of Fedora Workstation."
This all seems out of scope, as Kamil said. Can we please stick to the non-media blocker policy discussion here? General concerns / ideas for upgrades, and specific potential upgrade issues, should get their own threads.
It's very likely true that upgraded systems get increasingly out of whack with freshly installed ones when it comes to default configurations of various packages - especially ones which don't use the modular, multiply-overridden configuration style, and thus can't easily update the distribution defaults post-install - but this doesn't really seem to have much to do with the question of what the release process policies WRT non-media blockers should be.
Yep, it's true my message was vaguely, unintentionally, in the vicinity of a drive-by / hijacking.
Hello folks,
I'd like to return to the high-level overview for this topic and discuss the changes we plan to do in our SOPs.
Since there was not much feedback, we agreed at a QA meeting that I'll just go ahead and draft all related SOP changes. Here we go:
1. SOP blocker bug process: https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3ASOP_bloc...
I've clarified that we should not close any blocker bug until the fix is pushed to a stable repo *and* also part of a TC/RC (if appropriate). If this is kept, it means we can easily look at the list of blockers and decide whether any type of blocker is still blocking us (some blocker bugs open) or not (all blocker bugs closed). This will be important in other SOPs.
Then I added the process of tracking AcceptedPreviousRelease fixes and verifying that the related updates repo metalink is in shape.
This document shares many templates with https://fedoraproject.org/wiki/QA:SOP_freeze_exception_bug_process , but I do not intend to modify that one, so I might need your help, Adam, to modify the templates in such a way that we adjust only one of the documents.
2. Go No Go Meeting https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go...
I wanted to avoid enumerating different types of blockers and their conditions here, so I use the previously described fact that any open blocker bugs should mean No Go, otherwise it means Go. But since people are not machines and mistakes will happen, I used "open or otherwise unaddressed accepted blocker bugs" to cover the case where we closed some bug sooner than it should have been, and it's still not addressed.
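As a rough illustration of the rule in this draft wording ("open or otherwise unaddressed accepted blocker bugs" means No Go), here is a small sketch. The `Blocker` class, its fields, and the bug IDs are placeholders invented for the example; the real decision is made by people at the Go/No-Go meeting, not by a script.

```python
from dataclasses import dataclass


@dataclass
class Blocker:
    bug_id: int      # placeholder ID, not a real Bugzilla bug
    is_open: bool    # bug is still open in Bugzilla
    addressed: bool  # fix is pushed stable and, where appropriate, in a TC/RC


def go_no_go(blockers):
    """Return ("GO", []) or ("NO GO", [ids of blockers still in the way])."""
    pending = [b.bug_id for b in blockers if b.is_open or not b.addressed]
    return ("NO GO", pending) if pending else ("GO", [])


if __name__ == "__main__":
    bugs = [
        Blocker(1, is_open=False, addressed=True),
        Blocker(2, is_open=False, addressed=False),  # closed too early
    ]
    print(go_no_go(bugs))
```

The second bug shows the case the draft wording is meant to catch: a bug that was closed sooner than it should have been still counts against "Go".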
I also switched GOLD to GO, which seems to be an oversight from the past.
I went through these documents and found them not needing any adjustments:
https://fedoraproject.org/wiki/QA:SOP_Blocker_Bug_Meeting
https://fedoraproject.org/wiki/QA:SOP_compose_request
https://fedoraproject.org/wiki/Blocker_Bug_FAQ
Do the changes look OK to you?
Thanks, Kamil
On Mon, 2016-02-29 at 04:42 -0500, Kamil Paral wrote:
Hello folks,
I'd like to return to the high-level overview for this topic and discuss the changes we plan to do in our SOPs.
Since there was not much feedback, we agreed at a QA meeting that I'll just go ahead and draft all related SOP changes. Here we go:
- SOP blocker bug process:
https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3ASOP_blocker_bug_process&diff=436236&oldid=435641
I've clarified that we should not close any blocker bug until the fix is pushed to a stable repo *and* also part of a TC/RC (if appropriate). If this is kept, it means we can easily look at the list of blockers and decide whether any type of blocker is still blocking us (some blocker bugs open) or not (all blocker bugs closed). This will be important in other SOPs.
Yeah, this looks fine. The current text wasn't even really considering the case of 'pushed stable but not included in a compose' as things usually happen the other way around (and we specifically had cases where fixes got included in a compose but not pushed stable, the bug was closed, and we never remembered to push the fix to stable).
Then I added the process of tracking AcceptedPreviousRelease fixes and verifying that the related updates repo metalink is in shape.
This generally looks fine, my only concern is that it's extremely specific stuff that might go stale quite easily. But since it's not at all an 'obvious' process, explaining it in detail is important.
It would be nice of course if there was a tool for doing this, then the text could be reduced to 'run the magic tool and make sure it says OK'.
This document shares many templates with https://fedoraproject.org/wiki/QA:SOP_freeze_exception_bug_process , but I do not intend to modify that one, so I might need your help, Adam, to modify the templates in such a way that we adjust only one of the documents.
This shouldn't really be too difficult. The text in question is all a part of a single template, I believe:
https://fedoraproject.org/wiki/Template:Blocker_freeze_exception_tracking
the first change applies equally to FE bugs, so there's no problem there, just apply the change straight to the template.
The second chunk is entirely inapplicable to the FE process as the concept of "non-media freeze exceptions" makes no sense and thus is not a thing; this actually makes things simple, because it means you can just dump that text straight into https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process right after the text "{{Blocker_freeze_exception_tracking|type=blocker}}". You'll notice that I added the previous chunks about non-media blockers directly to the blocker SOP page too, for the same reason: they're completely inapplicable to the FE process and always will be, so it doesn't make any sense to put them in the shared templates and conditionalize them not to appear for the blocker page, it's simpler to just put them straight into the blocker page.
- Go No Go Meeting
https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go_Meeting&diff=436242&oldid=435628
I wanted to avoid enumerating different types of blockers and their conditions here, so I use the previously described fact that any open blocker bugs should mean No Go, otherwise it means Go. But since people are not machines and mistakes will happen, I used "open or otherwise unaddressed accepted blocker bugs" to cover the case where we closed some bug sooner than it should have been, and it's still not addressed.
IIRC the text in fact uses "unaddressed" specifically *instead* of saying "open", as a slight fudge for cases where a bug might still be open but is in fact fully 'addressed'. We *are* reducing the likelihood of that scenario with this change (i.e. we can't say "go" if a fix is in the compose but not yet pushed stable any more), but I'm not 100% sure we've removed any possibility of a bug being in this state somehow. So I'm not 100% against this change but a bit worried by it.
I also switched GOLD to GO, which seems to be an oversight from the past.
It's not, exactly. The two terms sort of coexist, it wasn't just that we switched from saying "gold" to saying "go" at some point. Conceptually it's the *release candidate* specifically that gets declared "gold", while the *release process* is "go" (or we are "go for release") if the candidate is declared "gold". I think we could at least *conceptually* declare a release candidate "GOLD" but not be "go" for release. It's kinda unnecessary to have both concepts, but the text reads slightly awkwardly if you simply do s/GOLD/GO/g/ as you did, because we don't really "declare the release "GO"", that's a somewhat odd phrasing.
I don't really mind if we want to rephrase this a bit to drop the 'gold' concept - we barely use it anywhere else but on this page - but it feels like it should be a separate change, not part of this revision, and it should be slightly more than just a search-n-replace so the text doesn't read weird :)
Do the changes look OK to you?
Aside from the above notes, yup! Thanks for working on this.
Then I added the process of tracking AcceptedPreviousRelease fixes and verifying that the related updates repo metalink is in shape.
This generally looks fine, my only concern is that it's extremely specific stuff that might go stale quite easily. But since it's not at all an 'obvious' process, explaining it in detail is important.
It would be nice of course if there was a tool for doing this, then the text could be reduced to 'run the magic tool and make sure it says OK'.
I'll do some thinking whether I can write such a magical tool.
- Go No Go Meeting
https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go_Meeting&diff=436242&oldid=435628
I wanted to avoid enumerating different types of blockers and their conditions here, so I use the previously described fact that any open blocker bugs should mean No Go, otherwise it means Go. But since people are not machines and mistakes will happen, I used "open or otherwise unaddressed accepted blocker bugs" to cover the case where we closed some bug sooner than it should have been, and it's still not addressed.
IIRC the text in fact uses "unaddressed" specifically *instead* of saying "open", as a slight fudge for cases where a bug might still be open but is in fact fully 'addressed'. We *are* reducing the likelihood of that scenario with this change (i.e. we can't say "go" if a fix is in the compose but not yet pushed stable any more), but I'm not 100% sure we've removed any possibility of a bug being in this state somehow. So I'm not 100% against this change but a bit worried by it.
OK, so what about this? https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go...
I also switched GOLD to GO, which seems to be an oversight from the past.
It's not, exactly. The two terms sort of coexist, it wasn't just that we switched from saying "gold" to saying "go" at some point. Conceptually it's the *release candidate* specifically that gets declared "gold", while the *release process* is "go" (or we are "go for release") if the candidate is declared "gold". I think we could at least *conceptually* declare a release candidate "GOLD" but not be "go" for release. It's kinda unnecessary to have both concepts, but the text reads slightly awkwardly if you simply do s/GOLD/GO/g/ as you did, because we don't really "declare the release "GO"", that's a somewhat odd phrasing.
I don't really mind if we want to rephrase this a bit to drop the 'gold' concept - we barely use it anywhere else but on this page - but it feels like it should be a separate change, not part of this revision, and it should be slightly more than just a search-n-replace so the text doesn't read weird :)
OK, I reverted this.
On Tue, 2016-03-01 at 06:28 -0500, Kamil Paral wrote:
IIRC the text in fact uses "unaddressed" specifically *instead* of saying "open", as a slight fudge for cases where a bug might still be open but is in fact fully 'addressed'. We *are* reducing the likelihood of that scenario with this change (i.e. we can't say "go" if a fix is in the compose but not yet pushed stable any more), but I'm not 100% sure we've removed any possibility of a bug being in this state somehow. So I'm not 100% against this change but a bit worried by it.
OK, so what about this? https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go...
Looks fine.
I also switched GOLD to GO, which seems to be an oversight from the past.
It's not, exactly. The two terms sort of coexist, it wasn't just that we switched from saying "gold" to saying "go" at some point. Conceptually it's the *release candidate* specifically that gets declared "gold", while the *release process* is "go" (or we are "go for release") if the candidate is declared "gold". I think we could at least *conceptually* declare a release candidate "GOLD" but not be "go" for release. It's kinda unnecessary to have both concepts, but the text reads slightly awkwardly if you simply do s/GOLD/GO/g/ as you did, because we don't really "declare the release "GO"", that's a somewhat odd phrasing.
I don't really mind if we want to rephrase this a bit to drop the 'gold' concept - we barely use it anywhere else but on this page - but it feels like it should be a separate change, not part of this revision, and it should be slightly more than just a search-n-replace so the text doesn't read weird :)
OK, I reverted this.
Thanks! So for me this is good to go.
It would be nice of course if there was a tool for doing this, then the text could be reduced to 'run the magic tool and make sure it says OK'.
I'll do some thinking whether I can write such a magical tool.
Here we go: https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-...
A bit more complicated than I expected. But now we can simplify the instructions to this: https://fedoraproject.org/w/index.php?title=User:Kparal/Draft:SOP_blocker_bu...
What do you think?
The tool itself prints output like this:
$ ./track-previous-release-blocker.py tmw-20130201-6.fc23
INFO Querying Koji for tmw-20130201-6.fc23 in f23-updates ...
INFO Build tmw-20130201-6.fc23 was tagged into f23-updates at:
  2016-02-29 18:03:33 UTC (1456769013.68)
INFO Downloading metalink for updates-released-f23 ...
INFO Metalink contains metadata with these timestamps:
  2016-02-29 04:47:02 UTC (1456721222.0) ✘ older than pushed package
  2016-02-29 21:44:02 UTC (1456782242.0) ✔ sufficiently recent
WARNING ✘ FAILED
Some metadata referenced in metalink is still older than the time when tmw-20130201-6.fc23 was tagged into f23-updates. Some users would not receive this update if they chose to update now.

$ ./track-previous-release-blocker.py dmidecode-3.0-1.fc23
INFO Querying Koji for dmidecode-3.0-1.fc23 in f23-updates ...
INFO Build dmidecode-3.0-1.fc23 was tagged into f23-updates at:
  2016-01-24 23:02:17 UTC (1453676537.74)
INFO Downloading metalink for updates-released-f23 ...
INFO Metalink contains metadata with these timestamps:
  2016-02-29 04:47:02 UTC (1456721222.0) ✔ sufficiently recent
  2016-02-29 21:44:02 UTC (1456782242.0) ✔ sufficiently recent
INFO ✔ PASSED
All metadata referenced in metalink is sufficiently newer than the time when dmidecode-3.0-1.fc23 was tagged into f23-updates. All users should be able to receive the update now.

$ ./track-previous-release-blocker.py datovka-4.5.0-1.fc23 --metalink ~/tmp/metalink.xml # hacked for demonstration purposes
INFO Querying Koji for datovka-4.5.0-1.fc23 in f23-updates ...
INFO Build datovka-4.5.0-1.fc23 was tagged into f23-updates at:
  2016-02-25 10:56:37 UTC (1456397797.11)
INFO Metalink contains metadata with these timestamps:
  2016-03-02 13:29:06 UTC (1456925346.0) ✘ DNF cache not yet expired (until 2016-03-02 19:29:06 UTC)
WARNING ✘ FAILED
All metadata referenced in metalink is newer than the time when datovka-4.5.0-1.fc23 was tagged into f23-updates. However, it still hasn't been at least 6 hours (the default DNF metadata expire duration) since any of the listed metadata were pushed, and so some users would not receive this update if they chose to update now. Please wait until the expiration time listed above.

$ ./track-previous-release-blocker.py sendmail-8.15.2-2.fc22
INFO Querying Koji for sendmail-8.15.2-2.fc22 in f22-updates ...
WARNING It seems that sendmail-8.15.2-2.fc22 hasn't yet been tagged into f22-updates! Please verify that your NVR is correct or that something else is not wrong.
Hint: You can see complete tag history by running:
  koji list-tag-history --build sendmail-8.15.2-2.fc22
It also has a --debug option if needed.
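For readers who want the gist of the check without reading the script, the comparison suggested by the output above can be sketched roughly as follows. This is a guess at the logic based only on the messages shown, not the actual code of track-previous-release-blocker.py: the function name `check_metadata`, the way the 6-hour expiry is applied per timestamp, and the `now` value used in the demo are all assumptions. The tag time and metadata timestamps are the ones printed in the datovka example.

```python
DNF_EXPIRE_SECONDS = 6 * 60 * 60  # default DNF metadata expiry quoted in the output


def check_metadata(tag_time, metadata_times, now):
    """Compare metalink metadata timestamps with the time a build was tagged.

    All arguments are Unix timestamps. Returns (passed, verdicts), where
    verdicts is a list of (timestamp, message) pairs.
    """
    verdicts = []
    for ts in metadata_times:
        if ts < tag_time:
            # Metadata predates the push, so it cannot contain the update.
            verdicts.append((ts, "older than pushed package"))
        elif now < ts + DNF_EXPIRE_SECONDS:
            # Clients may still be using an older cached copy of the metadata.
            verdicts.append((ts, "DNF cache not yet expired"))
        else:
            verdicts.append((ts, "sufficiently recent"))
    passed = all(message == "sufficiently recent" for _, message in verdicts)
    return passed, verdicts


if __name__ == "__main__":
    # "now" is an arbitrary moment less than 6 hours after the metadata push.
    passed, verdicts = check_metadata(1456397797.11, [1456925346.0], now=1456930000.0)
    print("PASSED" if passed else "FAILED", verdicts)
```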
It would be nice of course if there was a tool for doing this, then the text could be reduced to 'run the magic tool and make sure it says OK'.
I'll do some thinking whether I can write such a magical tool.
Here we go: https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-...
Also, we could run this from blockerbugs and track it automatically (for example not hiding even closed previousrelease blockers, until they satisfy that condition). The script itself can be easily adjusted for that. But I'm not convinced this is worth the effort on the blockerbugs side, I imagine it could require a few non-trivial changes. I don't expect this tool to be used that often even manually. So I'd keep it like this (manual execution) for this cycle, we'll see how often we need it, and then we can decide whether blockerbugs integration is worth the effort.
Thoughts?
On Wed, 2016-03-02 at 09:45 -0500, Kamil Paral wrote:
It would be nice of course if there was a tool for doing this, then the text could be reduced to 'run the magic tool and make sure it says OK'.
I'll do some thinking whether I can write such a magical tool.
Here we go: https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-...
#!/usr/bin/env python2
there's no excuse for this. ;)
What's the deal with making args.nvr a list then requiring it to be exactly one item long?
A bit more complicated than I expected. But now we can simplify the instructions to this: https://fedoraproject.org/w/index.php?title=User:Kparal/Draft:SOP_blocker_bu...
What do you think?
+1! And of course eventually we can integrate it into making this process more automated.
Here we go: https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-...
#!/usr/bin/env python2
there's no excuse for this. ;)
I'll do better in future, I promise! :)
What's the deal with making args.nvr a list then requiring it to be exactly one item long?
I thought that's how argparse always works, returning a list for positional arguments. But it does that only if you use nargs. Thanks, fixed.
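The argparse behaviour in question is easy to demonstrate. The snippet below is a standalone illustration, not an excerpt from the script; only the argument name `nvr` and the sample NVR are taken from this thread.

```python
import argparse

# Without nargs, a positional argument is stored as a plain string.
plain = argparse.ArgumentParser()
plain.add_argument("nvr")

# With nargs=1, the same argument is stored as a one-item list.
listed = argparse.ArgumentParser()
listed.add_argument("nvr", nargs=1)

nvr = "tmw-20130201-6.fc23"
as_string = plain.parse_args([nvr]).nvr
as_list = listed.parse_args([nvr]).nvr

print(repr(as_string))
print(repr(as_list))
```

So dropping nargs is enough when exactly one NVR is expected, and no length check on a list is needed.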
A bit more complicated than I expected. But now we can simplify the instructions to this: https://fedoraproject.org/w/index.php?title=User:Kparal/Draft:SOP_blocker_bu...
What do you think?
+1!
OK, I'll push the changes live.
The changes are now live:
https://fedoraproject.org/w/index.php?title=QA%3ASOP_blocker_bug_process&...
https://fedoraproject.org/w/index.php?title=Template%3ABlocker_freeze_except...
https://fedoraproject.org/w/index.php?title=Go_No_Go_Meeting&diff=436729...