Hello everyone,
I'm an Anaconda developer and I also take care of our infrastructure. This Fedora release brought me plenty of "unnecessary" work because a compose for Fedora 31 was not available until a week before the beta freeze. That is too late. I wasn't the only one with these problems: copr had issues with Fedora 31 and couldn't enable the chroot, so they had to make changes to work around the breakage. And that is not to mention the Fedora QA team, which could test almost nothing before the beta freeze.
The problem is that without a compose we don't have packages for testing, yet more and more changes keep landing and we cannot check whether they work. Without packages, mock can't work properly and you cannot do a system upgrade. The only test point is the compose itself, but that covers just a small portion. Not being able to test Fedora for a few weeks is a situation that should not happen.
To make things even worse, there was a switch to Python 3.8 on Rawhide which wasn't really prepared (pylint did not work). So for a few days both Fedora 31 and Rawhide were broken for us, and most of our tests were not working. I would really say that we were programming in the dark: no tests, no check that changes work. It took me almost a week to get everything working again, not counting the time spent waiting for the compose to be available.
I want to ask for an improvement here. The ideal solution for me would be a rule that a compose must exist before branching happens, and that if the compose fails, the branching doesn't happen. I'm not sure whether this is doable or how hard such a rule would be to implement, but it would be the ultimate solution. The compose blocker bugs would then have to be solved on Rawhide, where they should be solved.
Please tell me what I should do next. Should I file a FESCo ticket to add this rule?
Best Regards, Jirka
On 17. 09. 19 15:58, jkonecny@redhat.com wrote:
To make things even worse there was a switch to python 3.8 on Rawhide which wasn't really prepared (pylint did not worked).
I trust that your intentions were only the best when you wrote your e-mail, but this statement kinda surprises me. The switch to Python 3.8 was very well prepared and I consider it one of the smoothest switches to a new Python version I can remember.
Pylint is never prepared for the switch. If something crucial is depending on pylint, it should stop.
On Tue, Sep 17, 2019, 16:28 Miro Hrončok mhroncok@redhat.com wrote:
On 17. 09. 19 15:58, jkonecny@redhat.com wrote:
To make things even worse there was a switch to python 3.8 on Rawhide which wasn't really prepared (pylint did not worked).
I trust that your intentions were only the best when you wrote your e-mail, but this statement kinda surprises me. The switch to Python 3.8 was very well prepared and I consider it one of the smoothest switches to a new Python version I can remember.
Pylint is never prepared for the switch. If something crucial is depending on pylint, it should stop.
Also, IIRC, the failing rawhide composes at the time weren't even caused by the switch to python 3.8, at least not exclusively.
Fabio
-- Miro Hrončok -- Phone: +420777974800 IRC: mhroncok _______________________________________________ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
On 17. 09. 19 16:33, Fabio Valentini wrote:
On Tue, Sep 17, 2019, 16:28 Miro Hrončok mhroncok@redhat.com wrote:
On 17. 09. 19 15:58, jkonecny@redhat.com wrote:
To make things even worse there was a switch to python 3.8 on Rawhide which wasn't really prepared (pylint did not worked).
I trust that your intentions were only the best when you wrote your e-mail, but this statement kinda surprises me. The switch to Python 3.8 was very well prepared and I consider it one of the smoothest switches to a new Python version I can remember.
Pylint is never prepared for the switch. If something crucial is depending on pylint, it should stop.
Also, IIRC, the failing rawhide composes at the time weren't even caused by the switch to python 3.8, at least not exclusively.
There was one failed rawhide compose caused by the 3.8 side tag [1]. This was resolved immediately and the compose was restarted. Since then, there has been no Python 3.8 related compose failure.
[1] https://pagure.io/releng/issue/8506
On Tue, 2019-09-17 at 16:33 +0200, Fabio Valentini wrote:
On Tue, Sep 17, 2019, 16:28 Miro Hrončok mhroncok@redhat.com wrote:
On 17. 09. 19 15:58, jkonecny@redhat.com wrote:
To make things even worse there was a switch to python 3.8 on
Rawhide
which wasn't really prepared (pylint did not worked).
I trust that your intentions were only the best when you wrote your e-mail, but
this statement kinda surprises me. The switch to Python 3.8 was very well
prepared and I consider it one of the smoothest switches to a new Python version
I can remember.
Pylint is never prepared for the switch. If something crucial is depending on
pylint, it should stop.
Also, IIRC, the failing rawhide composes at the time weren't even caused by the switch to python 3.8, at least not exclusively.
Fabio
Broken Rawhide composes are not the problem. When you have any compose in the history then you have at least something (even if it is not the newest). The problem is when you don't have any compose at all.
Jirka
Hi Miro,
No, I did not mean to say that the Python 3.8 transition was badly executed. The problem was more about timing.
I was fighting with Fedora 31, and we had Rawhide as a replacement for the F31 tests because the environment is usually not that different. However, when the switch to Python 3.8 happened, there was no compose ready for Fedora 31. In the end we had broken tests and even a broken backup test solution. I had to find a third solution.
On Tue, 2019-09-17 at 16:27 +0200, Miro Hrončok wrote:
On 17. 09. 19 15:58, jkonecny@redhat.com wrote:
To make things even worse there was a switch to python 3.8 on Rawhide which wasn't really prepared (pylint did not worked).
I trust that your intentions were only the best when you wrote your e-mail, but this statement kinda surprises me. The switch to Python 3.8 was very well prepared and I consider it one of the smoothest switches to a new Python version I can remember.
Pylint is never prepared for the switch. If something crucial is depending on pylint, it should stop.
However, since you opened this discussion: why are we not waiting for pylint to be ready? Is there a reason? I mean, pylint is one of the main linters available for Python, so having Python without a working pylint looks kind of fragile to me.
I also want to thank you a lot for all the work on the Python migration. It's not an easy task and you are doing great work there!
Jirka
On 17. 09. 19 16:39, jkonecny@redhat.com wrote:
Hi Miro,
No, I did not wanted to tell that the python 3.8 transition was badly executed. The problem was more about timing.
I was fighting with Fedora 31 and we had Rawhide as replacement for F31 tests because the environment is usually not that different. However, when the switch to python 3.8 happened, there were no compose ready for Fedora 31. In the end we were with broken tests and even broken backup tests solution. I had to found third solution.
As soon as Fedora branches is IMHO actually the best timing for making rawhide different. It means that people might actually stop breaking branched for a while focusing on the rawhide differences :D
Either way, here are two ideas about what bothers you:
1) Stop caring about compose, test/build against koji
Add the local repo to your test system (mock/copr/whatnot):
https://kojipkgs.fedoraproject.org/repos/f31-build/latest/x86_64/
2) Maybe we should freeze immediately after we branch and only allow compose fixes to go in until the first successful branched compose?
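For idea 1, a minimal sketch of what adding that repo to a test system could look like. The Python below only renders a dnf-style .repo stanza for the URL given above. The repo id koji-f31-build, the option values (gpgcheck=0, skip_if_unavailable=1) and the helper name render_repo_stanza are assumptions for illustration, not something prescribed in this thread.

```python
import configparser
import io

# URL quoted in the thread; everything else below is an illustrative assumption.
KOJI_F31_BUILD_REPO = "https://kojipkgs.fedoraproject.org/repos/f31-build/latest/x86_64/"


def render_repo_stanza(repo_id: str, baseurl: str) -> str:
    """Return the text of a dnf .repo stanza that points at baseurl."""
    parser = configparser.ConfigParser(interpolation=None)
    parser[repo_id] = {
        "name": f"Koji buildroot repo ({repo_id})",
        "baseurl": baseurl,
        "enabled": "1",
        # Assumption: buildroot packages may not be signed with the release
        # key, so signature checking is turned off for this test-only repo.
        "gpgcheck": "0",
        # Assumption: skip the repo instead of failing if it is unreachable.
        "skip_if_unavailable": "1",
    }
    buffer = io.StringIO()
    # Write key=value without spaces around "=", as .repo files usually look.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_repo_stanza("koji-f31-build", KOJI_F31_BUILD_REPO))
```

The rendered text could be saved under /etc/yum.repos.d/ on a test machine or pasted into a mock or copr chroot configuration. Whether pointing many systems at this URL is appropriate is a separate question.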
Pylint is never prepared for the switch. If something crucial is depending on pylint, it should stop.
However, when you opened this discussion. Why we are not waiting for pylint to be ready? Is there a reason? I mean pylint is one of the main linters available for python, so having python without working pylint looks kind of fragile to me.
Because pylint is never ready in time. The upstream doesn't work on 3.X problems when we do (usually responding something like: "the current pylint doesn't support 3.X, 3.X support is planned for the next release"). And that is completely fine in my POV.
There are around 3000 Python 3 packages in Fedora. We carefully decide what packages we "wait" for. It's not a defined set, but mostly we look if the following works:
- default package set in blocking media
- critpath
- fedpkg, bodhi, koji, mock...
pylint simply isn't blocking anything "important". Having Python without working pylint is completely fine. Linters are optional. When you want to run tests, you don't need linters. If you disagree (and that's fine as well) and want to make this better next time, you can co-maintain pylint [1] and when we update to 3.9, you can provide patches. The Python Maintenance team will not (because we don't consider pylint important).
I also want to tell you thanks a lot for doing all the work about python migration it's not an easy task and you are doing great work there!
Thanks.
[1] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
On Tue, 2019-09-17 at 16:56 +0200, Miro Hrončok wrote:
On 17. 09. 19 16:39, jkonecny@redhat.com wrote:
Hi Miro,
No, I did not wanted to tell that the python 3.8 transition was badly executed. The problem was more about timing.
I was fighting with Fedora 31 and we had Rawhide as replacement for F31 tests because the environment is usually not that different. However, when the switch to python 3.8 happened, there were no compose ready for Fedora 31. In the end we were with broken tests and even broken backup tests solution. I had to found third solution.
As soon as Fedora branches is IMHO actually the best timing for making rawhide different. It means that people might actually stop breaking branched for a while focusing on the rawhide differences :D
Either way, here are two ideas about what bothers you:
- Stop caring about compose, test/build against koji
Add the local repo to your test system (mock/copr/whatnot):
https://kojipkgs.fedoraproject.org/repos/f31-build/latest/x86_64/
I'm not sure you want everyone to use this mirror for everything. I guess that is not its purpose. Also, you lose the benefit of checking that the compose can actually be created.
- Maybe we should freeze immediately after we branch and only allow
compose fixes to go in until the first successful branched compose?
That also looks like an interesting plan. That may help to get compose ready sooner.
Pylint is never prepared for the switch. If something crucial is depending on pylint, it should stop.
However, when you opened this discussion. Why we are not waiting for pylint to be ready? Is there a reason? I mean pylint is one of the main linters available for python, so having python without working pylint looks kind of fragile to me.
Because pylint is never ready in time. The upstream doesn't work on 3.X problems when we do (usually responding something like: "the current pylint doesn't support 3.X, 3.X support is planned for the next release"). And that is completely fine in my POV.
There are around 3000 Python 3 packages in Fedora. We carefully decide what packages we "wait" for. It's not a defined set, but mostly we look if the following works:
- default package set in blocking media
- critpath
- fedpkg, bodhi, koji, mock...
pylint simply isn't blocking anything "important". Having Python without working pylint is completely fine. Linters are optional. When you want to run tests, you don't need linters. If you disagree (and that's fine as well) and want to make this better next time, you can co-maintain pylint [1] and when we update to 3.9, you can provide patches. The Python Maintenance team will not (because we don't consider pylint important).
Interesting. Do they not use any linters, or do they just use a different one? Do you know?
I also want to tell you thanks a lot for doing all the work about python migration it's not an easy task and you are doing great work there!
Thanks.
[1] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
On 17. 09. 19 15:58, jkonecny@redhat.com wrote:
I want to ask for an improvement here. Ideal solution for me would be to add rule that there have to be compose to do the branching and if the compose fails then the branching won't happen.
We need to branch before we do the branched compose. Even if we check that rawhide composes prior to branching, there is no guarantee that branched would compose right after it.
Ultimately, what happens if rawhide doesn't compose? You delay branching until it does? How does that help with "beta freeze happens too soon after the branched compose" problem?
On Tue, 2019-09-17 at 16:31 +0200, Miro Hrončok wrote:
On 17. 09. 19 15:58, jkonecny@redhat.com wrote:
I want to ask for an improvement here. Ideal solution for me would be to add rule that there have to be compose to do the branching and if the compose fails then the branching won't happen.
We need to branch before we do the branched compose. Even if we check that rawhide composes prior to branching, there is no guarantee that branched would compose right after it.
Are we able to make the compose on fedora-31-candidate in a temporary tag and then do the branching based on the result?
If that is not doable, what about taking the last Rawhide compose and marking it as the first compose of the newly branched Fedora? The only thing I'm asking for is a baseline, which is not available right now.
Ultimately, what happens if rawhide doesn't compose? You delay branching until it does? How does that help with "beta freeze happens too soon after the branched compose" problem?
I guess that the freeze date could move based on the branching date?
The point is to avoid a situation where you have to switch to something without knowing what's inside and, most importantly, without a backup, which is not available without the compose. Basically, not having to make temporary solutions to pretend that you are testing branched Fedora everywhere.
For example, one of the problems that happened concerned our daily COPR builds. There was no chroot for Fedora 31. When the Fedora 31 chroot was added, the latest Rawhide chroot was used as the new base Fedora 31 chroot. That behavior seems correct to me because Fedora is based on Rawhide. However, the Rawhide chroot already had much newer package versions than Fedora 31 (thanks to the delay), and Python 3.8. That meant we had to remove all our daily builds manually and run the builds again, just because there was no compose on the branching date.
I don't see a reason why that should happen. If there had been no branch for Fedora 31, the Python 3.8 changes wouldn't have landed (or so I expect), and even if they had, we would be working with Python 3.8 everywhere consistently. Python 3.8 is just our current situation, but it could be anything: a new gcc, git, or make could break tests or builds.
Jirka
On 17. 09. 19 17:00, jkonecny@redhat.com wrote:
If that is not doable what about taking last Rawhide compose and mark that as first compose of newly branched Fedora? The only thing I'm asking for is to have a base ground which is not available right now.
That is actually a nice proposal. I wonder whether it is technically possible. CCing (hopefully) relevant people.
----- Original Message -----
From: "Miro Hrončok" mhroncok@redhat.com To: devel@lists.fedoraproject.org Cc: "Mohan Boddu" mboddu@bhujji.com, "Lubomir Sedlar" lsedlar@redhat.com, "Kevin Fenzi" kevin@scrye.com Sent: Tuesday, September 17, 2019 5:04:10 PM Subject: Re: Add a rule to have a compose when Fedora branched
On 17. 09. 19 17:00, jkonecny@redhat.com wrote:
If that is not doable what about taking last Rawhide compose and mark that as first compose of newly branched Fedora? The only thing I'm asking for is to have a base ground which is not available right now.
That is actually a nice proposal. I wonder whether it is technically possible. CCing (hopefully) relevant people.
I personally think it should work, as long as the Rawhide compose you take and rebrand as F3X is older than the F3X branching date. That way all the dist tags will be fc3X and there will not be any unexpectedly updated post-branch Rawhide packages.
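A minimal sketch of that condition, assuming compose IDs shaped like Fedora-Rawhide-YYYYMMDD.n.N. The IDs, the branching date and the helper names below are made up for illustration. It simply picks the newest Rawhide compose that is strictly older than the branching date.

```python
from datetime import date
from typing import Iterable, Optional, Tuple


def compose_key(compose_id: str) -> Tuple[date, int]:
    """Split an ID like 'Fedora-Rawhide-20190812.n.1' into (date, respin)."""
    stamp, _kind, respin = compose_id.split("-")[-1].split(".")
    day = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    return day, int(respin)


def last_compose_before(compose_ids: Iterable[str], branch_date: date) -> Optional[str]:
    """Return the newest compose dated strictly before branch_date, if any."""
    candidates = [c for c in compose_ids if compose_key(c)[0] < branch_date]
    # max() with default=None handles the "no usable compose at all" case.
    return max(candidates, key=compose_key, default=None)


if __name__ == "__main__":
    # Hypothetical compose history and branching date, for illustration only.
    history = [
        "Fedora-Rawhide-20190810.n.0",
        "Fedora-Rawhide-20190812.n.0",
        "Fedora-Rawhide-20190812.n.1",
        "Fedora-Rawhide-20190814.n.0",
    ]
    print(last_compose_before(history, date(2019, 8, 13)))
```

This only covers choosing the compose. It says nothing about whether rebranding it is technically possible on the release engineering side.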
On 9/17/19 8:04 AM, Miro Hrončok wrote:
On 17. 09. 19 17:00, jkonecny@redhat.com wrote:
If that is not doable what about taking last Rawhide compose and mark that as first compose of newly branched Fedora? The only thing I'm asking for is to have a base ground which is not available right now.
That is actually a nice proposal. I wonder whether it is technically possible. CCing (hopefully) relevant people.
It is not.
Branching is not just "oh, make a new compose". There's a ton of steps/work that happens then, including:
* Making a new branch on all active rpms
* Switching to a new signing key in rawhide.
* New pungi-fedora config, new comps, new kickstarts.
* Setting up new koji tags, etc.
I'm sorry for the delay in a f31 compose this time. ;(
Here's my suggestions:
* Make sure branching isn't right after flock. Mohan was traveling and we were both jetlagged so I think it was harder to watch things.
* We should leverage rawhide gating in the next branched: Set it up for gating just like rawhide (this time we didn't) and then actually disable allowing new builds in until we have a compose. This would have saved us many days of people landing broken stuff we had to sort out. We could at least get a compose to start with. The next compose might get a pile, but at least we don't have to fight a moving target.
* somehow figure out the pungi-gather segfaulting issue and fix it. This doomed several composes.
* Now that we have composes somewhat faster, we can run 3-4 a day at least, so that should speed up fixing things.
* Stop rawhide composes until we have a branched compose. This may not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
kevin
Dne 17. 09. 19 v 18:28 Kevin Fenzi napsal(a):
Branching is not just "oh, make a new compose". There's a ton of steps/work that happens then, including:
- Making a new branch on all active rpms
- Switching to a new signing key in rawhide.
- New pungi-fedora config, new comps, new kickstarts.
- Setting up new koji tags, etc.
Is this process documented somewhere (Maitai, UML, jBPM)? I am only aware of the Fedora release schedule. How many of these actions are triggered automatically and how many of them have to be started by hand?
On 9/18/19 12:02 AM, Miroslav Suchý wrote:
Dne 17. 09. 19 v 18:28 Kevin Fenzi napsal(a):
Branching is not just "oh, make a new compose". There's a ton of steps/work that happens then, including:
- Making a new branch on all active rpms
- Switching to a new signing key in rawhide.
- New pungi-fedora config, new comps, new kickstarts.
- Setting up new koji tags, etc.
Is this process documented somewhere (Maitai, UML, jBPM)? I am only aware of Fedora release schedule. How many of these actions are triggered automatically and how many of them has to be started by hand?
Releng has scripts for it: https://pagure.io/releng/blob/master/f/scripts/branching
and an SOP (although it needs some work):
https://docs.pagure.org/releng/sop_mass_branching.html
Most of the actions are human run. There's nothing really to trigger off of. If we had time we could possibly roll all of this into an Ansible playbook or the like.
kevin
Hi Kevin, Thanks for the explanation. See my comments below.
On Tue, 2019-09-17 at 09:28 -0700, Kevin Fenzi wrote:
On 9/17/19 8:04 AM, Miro Hrončok wrote:
On 17. 09. 19 17:00, jkonecny@redhat.com wrote:
If that is not doable what about taking last Rawhide compose and mark that as first compose of newly branched Fedora? The only thing I'm asking for is to have a base ground which is not available right now.
That is actually a nice proposal. I wonder whether it is technically possible. CCing (hopefully) relevant people.
It is not.
Branching is not just "oh, make a new compose". There's a ton of steps/work that happens then, including:
- Making a new branch on all active rpms
- Switching to a new signing key in rawhide.
- New pungi-fedora config, new comps, new kickstarts.
- Setting up new koji tags, etc.
I'm sorry for the delay in a f31 compose this time. ;(
I don't think it's your fault or anyone else's. I think it's a fault of the system, and that is what I want to fix.
Here's my suggestions:
- Make sure branching isn't right after flock. Mohan was traveling
and we were both jetlagged so I think it was harder to watch things.
Definitely good point!
- We should leverage rawhide gating in the next branched: Set it up
for gating just like rawhide (this time we didn't) and then actually disable allowing new builds in until we have a compose. This would have saved us many days of people landing broken stuff we had to sort out. We could at least get a compose to start with. The next compose might get a pile, but at least we don't have to fight a moving target.
- somehow figure out the pungi-gather segfaulting issue and fix it.
This doomed several composes.
- Now that we have composes somewhat faster, we can run 3-4 a day at
least, so that should speed up fixing things.
- Stop rawhide composes until we have a branched compose. This may
not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
Where should I sign this? :)
But seriously, from my understanding you are basically proposing an improved version of what Miro mentioned elsewhere in this thread: a freeze after branching. I definitely agree with that.
Aside from that, you are suggesting Rawhide should freeze too until the branched Fedora has a compose. I'm not sure that would really help, because if Rawhide *unfreezes* on the same date that the branched Fedora gets its first compose, we don't have time to react. In my F31 case, most importantly, copr would be in a similar situation: they would use the *new* Rawhide compose (unless they are really fast) instead of the old one for a new Fedora chroot. And I don't think we want to add a lag between the successful Fedora compose and the Rawhide one.
Other than this one point I agree with what you just wrote.
Jirka
On 9/18/19 1:41 AM, jkonecny@redhat.com wrote:
Hi Kevin, Thanks for the explanation. See my comments below.
...snip...
- Stop rawhide composes until we have a branched compose. This may
not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
Where should I signed this? :)
But now for real, from my understanding you are basically proposing improved version of what Miro mentioned somewhere here in the thread. That means make freeze after branching. I definitely agree on that.
Well, I was not saying we should go into Beta freeze immediately and stay in it until Beta is out. I was just saying we should have a temporary freeze right after branching until we have a completed branched compose. Then we can open things up again until Beta freeze.
Aside of that you are suggesting Rawhide should freeze too before the branched Fedora has a compose. Not sure if that would really help because if the Rawhide will *unfreeze* the same date as when the branched Fedora have first compose then we don't have a time to react.
Well, I am not sure we need to stop rawhide until we have a branched compose. This time it would have helped because mirrormanager pointed all the branched repos to rawhide since there was no branched, so it caused a lot of confusion.
If we can land the change to make rawhide use 'rawhide' as its version, it would avoid this confusion, as people wanting to switch to branched would need to explicitly sync to it (which means it has to exist).
In my F31 case most importantly copr will be in similar situation that they will use Rawhide *new* compose (if they won't be really fast) instead of the old one for a new Fedora chroot. And I don't think we want to add some lag between the successful Fedora compose and the Rawhide one.
I'm not sure what you mean here, can you expand on it or provide an example?
Other than this one point I agree with what you just wrote.
Jirka
kevin
On 19. 09. 19 23:29, Kevin Fenzi wrote:
- Stop rawhide composes until we have a branched compose. This may
not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
Where should I signed this?:)
But now for real, from my understanding you are basically proposing improved version of what Miro mentioned somewhere here in the thread. That means make freeze after branching. I definitely agree on that.
Well, I was not saying we should go into Beta freeze immediately and stay in it until Beta is out. I was just saying we should have a temporary freeze right after branching until we have a completed branched compose. Then we can open things up again until Beta freeze.
To clarify: That's exactly what I meant.
On 9/19/19 3:33 PM, Miro Hrončok wrote:
On 19. 09. 19 23:29, Kevin Fenzi wrote:
- Stop rawhide composes until we have a branched compose. This may
not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
Where should I signed this?:)
But now for real, from my understanding you are basically proposing improved version of what Miro mentioned somewhere here in the thread. That means make freeze after branching. I definitely agree on that.
Well, I was not saying we should go into Beta freeze immediately and stay in it until Beta is out. I was just saying we should have a temporary freeze right after branching until we have a completed branched compose. Then we can open things up again until Beta freeze.
To clarify: That's exactly what I meant.
ok. That does give developers 2 less weeks to finish up any changes (or get FE/blockers for them), but we could do that, sure.
What does everyone else think?
kevin
On Fri, Sep 20, 2019 at 12:19 PM Kevin Fenzi kevin@scrye.com wrote:
On 9/19/19 3:33 PM, Miro Hrončok wrote:
On 19. 09. 19 23:29, Kevin Fenzi wrote:
- Stop rawhide composes until we have a branched compose. This may
not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
Where should I signed this?:)
But now for real, from my understanding you are basically proposing improved version of what Miro mentioned somewhere here in the thread. That means make freeze after branching. I definitely agree on that.
Well, I was not saying we should go into Beta freeze immediately and stay in it until Beta is out. I was just saying we should have a temporary freeze right after branching until we have a completed branched compose. Then we can open things up again until Beta freeze.
To clarify: That's exactly what I meant.
ok. That does give developers 2 less weeks to finish up any changes (or get FE/blockers for them), but we could do that, sure.
What does everyone else think?
I'm not really in favor of any more time compressions in the schedule than we already have. It's already pretty hard for me to get what I need done on time in Fedora. Losing the Alpha made things much worse for me personally, as that extra window is just now gone. Losing two more weeks and then going immediately into freeze would make things considerably more difficult, as we lose the window to make last minute corrections before Beta.
On 22. 09. 19 1:07, Neal Gompa wrote:
ok. That does give developers 2 less weeks to finish up any changes (or get FE/blockers for them), but we could do that, sure.
What does everyone else think?
I'm not really in favor of any more time compressions in the schedule than we already have. It's already pretty hard for me to get what I need done on time in Fedora. Losing the Alpha made things much worse for me personally, as that extra window is just now gone. Losing two more weeks and then going immediately into freeze would make things considerably more difficult, as we lose the window to make last minute corrections before Beta.
I don't quite understand where are the 2 weeks gone.
1. we branch. we freeze f31 until successful compose
2. compose runs, it fails, we fix it (things unblocking compose get exceptions)
3. repeat 2.
4. unfreeze
This can be any arbitrary amount of time. I would expect it to be couple days, not 2 weeks. Where is 2 weeks coming from?
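The four steps above can be sketched as a loop. This is only a toy model under stated assumptions: run_compose and apply_fix are hypothetical callables standing in for the real compose run and for landing a compose fix under a freeze exception, and max_attempts is an arbitrary safety limit. It is not how releng tooling actually works.

```python
from typing import Callable, List


def freeze_until_first_compose(
    run_compose: Callable[[], List[str]],
    apply_fix: Callable[[str], None],
    max_attempts: int = 20,
) -> int:
    """Model of the post-branch freeze; returns the number of compose attempts.

    run_compose returns the list of problems blocking the compose
    (an empty list means the compose succeeded).
    """
    # Step 1: we have branched and the branch is frozen.
    for attempt in range(1, max_attempts + 1):
        blockers = run_compose()  # step 2: the compose runs
        if not blockers:
            return attempt  # step 4: first successful compose, unfreeze
        for blocker in blockers:
            apply_fix(blocker)  # only compose fixes get a freeze exception
        # Step 3: repeat step 2.
    raise RuntimeError(f"no successful compose after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated state: two made-up blockers that are fixed after one failure.
    outstanding = {"broken-dependency", "missing-comps-group"}

    def fake_compose() -> List[str]:
        return sorted(outstanding)

    def fake_fix(blocker: str) -> None:
        outstanding.discard(blocker)

    print(freeze_until_first_compose(fake_compose, fake_fix))
```

The length of the freeze in this model depends only on how many compose attempts are needed and how long each one takes, not on a fixed number of weeks.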
Hi Kevin,
Could we please create an action item list for a FESCo ticket?
I guess in the end we don't want to freeze Rawhide, but we should have the steps required for the branched Fedora freeze. Do you know how to do that, or could you point to someone who does?
Jirka
On Sun, 2019-09-22 at 10:48 +0200, Miro Hrončok wrote:
On 22. 09. 19 1:07, Neal Gompa wrote:
ok. That does give developers 2 less weeks to finish up any changes (or get FE/blockers for them), but we could do that, sure.
What does everyone else think?
I'm not really in favor of any more time compressions in the schedule than we already have. It's already pretty hard for me to get what I need done on time in Fedora. Losing the Alpha made things much worse for me personally, as that extra window is just now gone. Losing two more weeks and then going immediately into freeze would make things considerably more difficult, as we lose the window to make last minute corrections before Beta.
I don't quite understand where are the 2 weeks gone.
1. we branch. we freeze f31 until successful compose
2. compose runs, it fails, we fix it (things unblocking compose get exceptions)
3. repeat 2.
4. unfreeze
This can be any arbitrary amount of time. I would expect it to be couple days, not 2 weeks. Where is 2 weeks coming from?
On Thu, 2019-09-19 at 14:29 -0700, Kevin Fenzi wrote:
On 9/18/19 1:41 AM, jkonecny@redhat.com wrote:
Hi Kevin, Thanks for the explanation. See my comments below.
...snip...
- Stop rawhide composes until we have a branched compose. This may not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
Where should I sign this? :)
But now for real: from my understanding, you are basically proposing an improved version of what Miro mentioned somewhere in this thread. That means adding a freeze after branching. I definitely agree with that.
Well, I was not saying we should go into Beta freeze immediately and stay in it until Beta is out. I was just saying we should have a temporary freeze right after branching until we have a completed branched compose. Then we can open things up again until Beta freeze.
As Miro already answered, it's not about the Beta freeze but about the new freeze you just described.
Aside from that, you are suggesting Rawhide should freeze too until the branched Fedora has a compose. I'm not sure that would really help, because if Rawhide *unfreezes* on the same date the branched Fedora has its first compose, we don't have time to react.
Well, I am not sure we need to stop rawhide until we have a branched compose. This time it would have helped because mirrormanager pointed all the branched repos to rawhide since there was no branched, so it caused a lot of confusion.
If we can land the change to make rawhide use 'rawhide' as its version, it would avoid this confusion, as people wanting to switch to branched would need to explicitly sync to it (which means it has to exist).
In my F31 case, most importantly, copr will be in a similar situation: they will use the *new* Rawhide compose (unless they are really fast) instead of the old one for a new Fedora chroot. And I don't think we want to add lag between the successful Fedora compose and the Rawhide one.
I'm not sure what you mean here, can you expand on it or provide an example?
What COPR was doing (I think they changed it after F31) was that, before a new chroot appeared in COPR, they copied the latest Rawhide chroot to have something working in the new chroot. So, if we build the new Rawhide the same day the F31 branched compose is ready, they don't have time to react and sync "the old Rawhide" version.
However, the above is just an example of what happened to me. I think most people here dealing with this stuff have to react somehow once the compose is available. Either way, you are trying to solve a different problem with the freeze than I am.
I'm not sure if I can help you with the "Rawhide freeze" decision; I didn't have the problem you are describing.
Jirka
On 9/20/19 4:39 AM, jkonecny@redhat.com wrote:
On Thu, 2019-09-19 at 14:29 -0700, Kevin Fenzi wrote:
On 9/18/19 1:41 AM, jkonecny@redhat.com wrote:
...snip...
In my F31 case, most importantly, copr will be in a similar situation: they will use the *new* Rawhide compose (unless they are really fast) instead of the old one for a new Fedora chroot. And I don't think we want to add lag between the successful Fedora compose and the Rawhide one.
I'm not sure what you mean here, can you expand on it or provide an example?
What COPR was doing (I think they changed it after F31) was that, before a new chroot appeared in COPR, they copied the latest Rawhide chroot to have something working in the new chroot. So, if we build the new Rawhide the same day the F31 branched compose is ready, they don't have time to react and sync "the old Rawhide" version.
I thought that was not manually copied, but just that way because we had no f31 compose, so f31 and rawhide were the same thing (due to mirrormanager).
However, the above is just an example of what happened to me. I think most people here dealing with this stuff have to react somehow once the compose is available. Either way, you are trying to solve a different problem with the freeze than I am.
I'm not sure if I can help you with the "Rawhide freeze" decision; I didn't have the problem you are describing.
Sure.
kevin
On Friday, September 20, 2019 6:08:46 PM CEST Kevin Fenzi wrote:
On 9/20/19 4:39 AM, jkonecny@redhat.com wrote:
What COPR was doing (I think they changed it after F31) was that, before a new chroot appeared in COPR, they copied the latest Rawhide chroot to have something working in the new chroot. So, if we build the new Rawhide the same day the F31 branched compose is ready, they don't have time to react and sync "the old Rawhide" version.
I thought that was not manually copied, but just that way because we had no f31 compose, so f31 and rawhide were the same thing (due to mirrormanager).
What we were doing before was that we (a) copied Rawhide to branched, and (b) enabled the branched chroot -- but both actions mostly atomically at the same time (the (b) was always done immediately right after (a)).
This time (F31), for (b) to happen we needed to wait until an F31 compose was available (so we waited with (a) as well). So we copied the Rawhide builds too late; many of them already had the fc32 dist tag.
So for the next time, we'll try to do (a) first, immediately after the branching moment. And we'll wait for the branched compose with (b).
Naturally users will miss some builds in branched (those which happen between (a) and (b)), but problems caused by this should be slightly easier to resolve than what happened during F31->F32.
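The ordering of (a) and (b) can be illustrated with a small Python sketch. It is only a model of the description above, not COPR code: the `Chroot` class, both function names, the chroot names and the package NVRs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chroot:
    """Hypothetical stand-in for a build chroot and its builds."""
    name: str
    builds: List[str] = field(default_factory=list)
    enabled: bool = False


def copy_rawhide_to_branched(rawhide: Chroot, branched_name: str) -> Chroot:
    """Step (a): snapshot Rawhide's builds at the branching moment."""
    return Chroot(name=branched_name, builds=list(rawhide.builds))


def enable_branched(branched: Chroot, compose_available: bool) -> bool:
    """Step (b): enable the chroot only once a branched compose exists."""
    if compose_available:
        branched.enabled = True
    return branched.enabled


rawhide = Chroot("fedora-rawhide", ["foo-1.0-1.fc31"], enabled=True)
# (a) happens immediately after branching, before Rawhide moves on.
branched = copy_rawhide_to_branched(rawhide, "fedora-31")
rawhide.builds.append("foo-1.1-1.fc32")  # later Rawhide build, new dist tag
enable_branched(branched, compose_available=False)  # still waiting
enable_branched(branched, compose_available=True)   # (b) compose exists
print(branched)
```

Because the snapshot is taken before Rawhide picks up fc32 builds, the branched chroot in this model never contains them; the cost is that builds made between (a) and (b) are missing from it.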
Pavel
On Sat, Sep 21, 2019 at 11:04:29PM +0200, Pavel Raiskup wrote:
On Friday, September 20, 2019 6:08:46 PM CEST Kevin Fenzi wrote:
On 9/20/19 4:39 AM, jkonecny@redhat.com wrote:
What COPR was doing (I think they changed it after F31) was that, before a new chroot appeared in COPR, they copied the latest Rawhide chroot to have something working in the new chroot. So, if we build the new Rawhide the same day the F31 branched compose is ready, they don't have time to react and sync "the old Rawhide" version.
I thought that was not manually copied, but just that way because we had no f31 compose, so f31 and rawhide were the same thing (due to mirrormanager).
What we were doing before was that we (a) copied Rawhide to branched, and (b) enabled the branched chroot -- but both actions mostly atomically at the same time (the (b) was always done immediately right after (a)).
This time (F31), for (b) to happen we needed to wait until an F31 compose was available (so we waited with (a) as well). So we copied the Rawhide builds too late; many of them already had the fc32 dist tag.
Ah, ok.
So for the next time, we'll try to do (a) first, immediately after the branching moment. And we'll wait for the branched compose with (b).
ok. Hopefully that will be much shorter this next time, and you won't have to worry about rawhide moving on if we pause those composes for branched to complete.
Naturally users will miss some builds in branched (those which happen between (a) and (b)), but problems caused by this should be slightly easier to resolve than what happened during F31->F32.
Yep.
kevin --
Pavel
On Tue, Sep 17, 2019 at 12:30 PM Kevin Fenzi kevin@scrye.com wrote:
On 9/17/19 8:04 AM, Miro Hrončok wrote:
On 17. 09. 19 17:00, jkonecny@redhat.com wrote:
If that is not doable what about taking last Rawhide compose and mark that as first compose of newly branched Fedora? The only thing I'm asking for is to have a base ground which is not available right now.
That is actually a nice proposal. I wonder whether it is technically possible. CCing (hopefully) relevant people.
It is not.
Branching is not just "oh, make a new compose". There's a ton of steps/work that happens then, including:
- Making a new branch on all active rpms
- Switching to a new signing key in rawhide.
- New pungi-fedora config, new comps, new kickstarts.
- Setting up new koji tags, etc.
I'm sorry for the delay in a f31 compose this time. ;(
Here's my suggestions:
- Make sure branching isn't right after flock. Mohan was traveling and we were both jetlagged so I think it was harder to watch things.
Yeah, I was traveling and on PTO for the first week after branching.
- We should leverage rawhide gating in the next branched: Set it up for gating just like rawhide (this time we didn't) and then actually disable allowing new builds in until we have a compose. This would have saved us many days of people landing broken stuff we had to sort out. We could at least get a compose to start with. The next compose might get a pile, but at least we don't have to fight a moving target.
- Somehow figure out the pungi-gather segfaulting issue and fix it. This doomed several composes.
- Now that we have composes somewhat faster, we can run 3-4 a day at least, so that should speed up fixing things.
- Stop rawhide composes until we have a branched compose. This may not be needed with the change to make rawhide use 'rawhide' and not the number, but we should consider it if we don't have a compose to avoid confusion.
I agree with all the suggestions, but as far as I can tell, the issue is that everything happened all at once: pungi-gather segfaulting, python 3.8 landing in rawhide, and maybe others.
"Minimal Compose", which has been on my plate for a long time, would also help: it will build an image when certain packages get built.
kevin
On Tue, Sep 17, 2019 at 10:00 AM jkonecny@redhat.com wrote:
I want to ask for an improvement here. The ideal solution for me would be to add a rule that there has to be a compose in order to do the branching, and if the compose fails then the branching won't happen. I'm not sure if this is doable or how hard it would be to implement such a rule; however, it would be the ultimate solution. Then the compose blocker bugs would have to be solved on Rawhide, where they should be solved.
I want to make sure I understand what you're proposing. You're suggesting that we shift from having the branch point be a defined, specific date in the schedule to having it be a range of possible options. So it would be more like the Beta and GA releases where we target a day, but slip if we're not ready. Is this right?
So how recent of a rawhide compose qualifies? If there's a successful compose in the last week, is that okay? Does it have to be the night before the branch point? Are there other judgments that go into it (like in the go/no-go meeting) or is it entirely based on the compose?
My initial reaction is that this is an outlier and that we shouldn't make structural changes to the schedule in response to outliers, but I wonder if there are other guardrails we can put in place. Perhaps by making consistent compose failures more visible to the community?
Please tell me what I should do next. Should I file a FESCo ticket to add this rule?
I expect FESCo will want to wait until the community has had an opportunity to discuss this first. I suggest waiting a week or so (depending on how long the conversation remains productively active) and then filing a FESCo ticket.
On Tue, 2019-09-17 at 10:38 -0400, Ben Cotton wrote:
On Tue, Sep 17, 2019 at 10:00 AM jkonecny@redhat.com wrote:
I want to ask for an improvement here. The ideal solution for me would be to add a rule that there has to be a compose in order to do the branching, and if the compose fails then the branching won't happen. I'm not sure if this is doable or how hard it would be to implement such a rule; however, it would be the ultimate solution. Then the compose blocker bugs would have to be solved on Rawhide, where they should be solved.
I want to make sure I understand what you're proposing. You're suggesting that we shift from having the branch point be a defined, specific date in the schedule to having it be a range of possible options. So it would be more like the Beta and GA releases where we target a day, but slip if we're not ready. Is this right?
Yes, I was thinking about that. However, that is only a proposal and maybe there is a better solution.
So how recent of a rawhide compose qualifies? If there's a successful compose in the last week, is that okay? Does it have to be the night before the branch point? Are there other judgments that go into it (like in the go/no-go meeting) or is it entirely based on the compose?
My original idea was to run compose build for the new private tag and then make the branching with the new compose if it succeeded.
My initial reaction is that this is an outlier and that we shouldn't make structural changes to the schedule in response to outliers, but I wonder if there are other guardrails we can put in place. Perhaps by making consistent compose failures more visible to the community?
The main point is to have a compose on the branching date, or at least to make the situation better than it is now. There are now other interesting proposals in the thread on how to handle this.
Please tell me what I should do next. Should I file a FESCo ticket to add this rule?
I expect FESCo will want to wait until the community has had an opportunity to discuss this first. I suggest waiting a week or so (depending on how long the conversation remains productively active) and then filing a FESCo ticket.
Thanks for the suggestion.
Jirka
-- Ben Cotton He / Him / His Fedora Program Manager Red Hat TZ=America/Indiana/Indianapolis
----- Original Message -----
From: "Ben Cotton" bcotton@redhat.com
To: "Development discussions related to Fedora" devel@lists.fedoraproject.org
Sent: Tuesday, September 17, 2019 4:38:50 PM
Subject: Re: Add a rule to have a compose when Fedora branched
On Tue, Sep 17, 2019 at 10:00 AM jkonecny@redhat.com wrote:
I want to ask for an improvement here. The ideal solution for me would be to add a rule that there has to be a compose in order to do the branching, and if the compose fails then the branching won't happen. I'm not sure if this is doable or how hard it would be to implement such a rule; however, it would be the ultimate solution. Then the compose blocker bugs would have to be solved on Rawhide, where they should be solved.
I want to make sure I understand what you're proposing. You're suggesting that we shift from having the branch point be a defined, specific date in the schedule to having it be a range of possible options. So it would be more like the Beta and GA releases where we target a day, but slip if we're not ready. Is this right?
I understand the need for having a reasonably specified set of dates so that people can plan their work around them. But simply triggering branching without any prior smoke test/dry run or CI, and blindly following the dates regardless of the result, does not seem like a good idea to me.
We could change the branching date to Not Earlier Than, branch on the date and only start the countdown for the outer milestones once there is a compose and possibly other pieces in place (Mock/Copr chroots, etc.) for people to actually get anything done on the branched release.
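As a rough illustration of the "Not Earlier Than" idea, here is a minimal Python sketch in which later milestones are counted from the first successful branched compose instead of from a fixed calendar date. The function name, the milestone names, the day offsets and the dates are all made up for the example and do not come from any real Fedora schedule.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def milestone_dates(
    branch_not_earlier_than: date,
    first_compose: Optional[date],
    offsets_days: Dict[str, int],
) -> Dict[str, date]:
    """Anchor later milestones on the first successful branched compose.

    Returns an empty dict while no compose exists yet, because the
    countdown has not started. Offsets are hypothetical day counts.
    """
    if first_compose is None:
        return {}
    # The countdown can never start before the earliest allowed branch date.
    start = max(branch_not_earlier_than, first_compose)
    return {
        name: start + timedelta(days=days)
        for name, days in offsets_days.items()
    }


# Hypothetical numbers: branch no earlier than Jan 1, compose lands Jan 6.
offsets = {"beta_freeze": 14, "beta_release": 35}
schedule = milestone_dates(date(2030, 1, 1), date(2030, 1, 6), offsets)
print(schedule)
```

In this model a late compose shifts every later milestone by the same number of days, so contributors keep the same working window at the price of a later release.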
So how recent of a rawhide compose qualifies? If there's a successful compose in the last week, is that okay? Does it have to be the night before the branch point? Are there other judgments that go into it (like in the go/no-go meeting) or is it entirely based on the compose?
My initial reaction is that this is an outlier and that we shouldn't make structural changes to the schedule in response to outliers, but I wonder if there are other guardrails we can put in place.
I'm afraid this is not an outlier as compose failures still happen very often, it was simply that we got a streak of them at the worst time possible. In other cases they might not be so visible, but still regularly cause pain to Fedora contributors and users.
To improve the situation there are basically two things to consider:
- what development milestones should require a successful compose before we move forward with them
- what breaks the compose & how we can avoid it, possibly by doing dry runs/CI runs on things before we introduce them to the compose cauldron
Perhaps by making consistent compose failures more visible to the community?
Please tell me what I should do next. Should I file a FESCo ticket to add this rule?
I expect FESCo will want to wait until the community has had an opportunity to discuss this first. I suggest waiting a week or so (depending on how long the conversation remains productively active) and then filing a FESCo ticket.
-- Ben Cotton He / Him / His Fedora Program Manager Red Hat TZ=America/Indiana/Indianapolis
FYI: FESCO ticket was created
https://pagure.io/fesco/issue/2246
On Tue, 2019-09-17 at 15:58 +0200, jkonecny@redhat.com wrote:
Hello everyone,
I'm an Anaconda developer and I also take care of our infrastructure, and this Fedora release brought me plenty of "unnecessary" work thanks to the fact that a compose for Fedora 31 was not available until a week before the beta freeze. That is too late. I wasn't the only one who had these problems: copr had issues with Fedora 31 and couldn't enable the chroot, so they had to make changes to correct these broken things. And I'm not even talking about the Fedora QA team, which couldn't test almost anything before the beta freeze.
The problem is that when we don't have a compose, we don't have packages for testing, and then more and more changes are getting in but we are not able to check whether they work. If we don't have packages, mock can't work properly and you are not able to do a system upgrade. The only test point is compose but that is just a small portion. Not being able to test Fedora for a few weeks is a situation which should not happen.
To make things even worse, there was a switch to python 3.8 on Rawhide which wasn't really prepared (pylint did not work). So for a few days we had both a broken Fedora 31 and a broken Rawhide, so most of our tests were not working. I would really say that we were programming in the dark. No tests, no checks that changes are working. It took me almost a week to get everything working again, not to mention the time spent waiting for the compose to be available.
I want to ask for an improvement here. The ideal solution for me would be to add a rule that there has to be a compose in order to do the branching, and if the compose fails then the branching won't happen. I'm not sure if this is doable or how hard it would be to implement such a rule; however, it would be the ultimate solution. Then the compose blocker bugs would have to be solved on Rawhide, where they should be solved.
Please tell me what I should do next. Should I file a FESCo ticket to add this rule?
Best Regards, Jirka
On Mon, Oct 14, 2019 at 11:59:30AM +0200, jkonecny@redhat.com wrote:
FYI: FESCO ticket was created
Yeah, and we had a bit more discussion there, which we probably should have just had here. ;)
In particular bcotton asked how we avoid scheduling the branch date "right after" flock:
"This is a good idea, but we'll need to figure out what it actually means.
How do we define "right after"? The schedule is set well before Flock is planned, so would the expectation be that Flock is scheduled to avoid this or that we should adjust the release schedule after Flock is set? If we adjust the release schedule how does that impact other milestones? Right now, all schedule milestones are effectively anchored off the target release date. If we have to delay the branch point to accommodate Flock, do we move the release date out or do we compress other parts of the schedule?"
Igor said:
"I'm strongly against this as a rawhide user. We should not block rawhide by anything, we do snapshot of it at some point and stabilize it instead."
So, any thoughts on those? I think we could perhaps drop the 'stop rawhide composes until we get a branched' from the proposal anyhow, as hopefully before too long we will have rawhide calling itself 'rawhide' instead of the number and this should avoid a lot of the confusion that happened this cycle with branched not being composed yet and rawhide composing correctly.
I am not sure how to avoid the flock dates. I guess we could add this in as a factor on flock dates? We would have to communicate that to folks planning flock. (The CAIC?)
Let's discuss it here a bit more until we have a more concrete thing for fesco to vote on.
kevin
On Mon, 2019-10-21 at 11:19 -0700, Kevin Fenzi wrote:
On Mon, Oct 14, 2019 at 11:59:30AM +0200, jkonecny@redhat.com wrote:
FYI: FESCO ticket was created
Yeah, and we had a bit more discussion there, which we probably should have just had here. ;)
In particular bcotton asked how we avoid scheduling the branch date "right after" flock:
"This is a good idea, but we'll need to figure out what it actually means.
How do we define "right after"? The schedule is set well before Flock is planned, so would the expectation be that Flock is scheduled to avoid this or that we should adjust the release schedule after Flock is set? If we adjust the release schedule how does that impact other milestones? Right now, all schedule milestones are effectively anchored off the target release date. If we have to delay the branch point to accommodate Flock, do we move the release date out or do we compress other parts of the schedule?"
I guess it will be easier to just think about the branching date when the Flock schedule is being created. However, I'm not familiar with the scheduling, so I'm probably not the right person to answer this.
Igor said:
"I'm strongly against this as a rawhide user. We should not block rawhide by anything, we do snapshot of it at some point and stabilize it instead."
So, any thoughts on those? I think we could perhaps drop the 'stop rawhide composes until we get a branched' from the proposal anyhow, as hopefully before too long we will have rawhide calling itself 'rawhide' instead of the number and this should avoid a lot of the confusion that happened this cycle with branched not being composed yet and rawhide composing correctly.
If the new compose stabilization takes only a few days (which is what we want to achieve here), then it shouldn't be a problem for Rawhide. On the other hand, it could be fine without the freeze, but I would like to avoid problems where the branched Fedora points to a Rawhide that has diverged more and more from what will be in the new Fedora...
I am not sure how to avoid the flock dates. I guess we could add this in as a factor on flock dates? We would have to communicate that to folks planning flock. (The CAIC?)
Let's discuss it here a bit more until we have a more concrete thing for fesco to vote on.
kevin
On Tue, Oct 22, 2019 at 02:09:04PM +0200, jkonecny@redhat.com wrote:
I guess it will be easier to just think about the branching date when the Flock schedule is being created. However, I'm not familiar with the scheduling, so I'm probably not the right person to answer this.
Perhaps Ben Cotton could chime in here. I think now that we have moved to planning the schedules years in advance, it's the flock schedule that moves around a lot based on when facilities are available and other things. I guess we just need to take the branching date into account if it's looking like flock will be near it?
Igor said:
"I'm strongly against this as a rawhide user. We should not block rawhide by anything, we do snapshot of it at some point and stabilize it instead."
So, any thoughts on those? I think we could perhaps drop the 'stop rawhide composes until we get a branched' from the proposal anyhow, as hopefully before too long we will have rawhide calling itself 'rawhide' instead of the number and this should avoid a lot of the confusion that happened this cycle with branched not being composed yet and rawhide composing correctly.
If the new compose stabilization takes only a few days (which is what we want to achieve here), then it shouldn't be a problem for Rawhide. On the other hand, it could be fine without the freeze, but I would like to avoid problems where the branched Fedora points to a Rawhide that has diverged more and more from what will be in the new Fedora...
Yeah, although this might go away as a problem because we hope to make rawhide call itself 'rawhide' instead of the number. This means people who are on 'rawhide' will stay on it by default, and will need to distro-sync to the branched version when it appears. In the f31 cycle mirrormanager had both f31 and f32 pointing to rawhide, for the next branching mirrormanager shouldn't do that.
kevin
On Mon, Oct 28, 2019 at 1:43 PM Kevin Fenzi kevin@scrye.com wrote:
On Tue, Oct 22, 2019 at 02:09:04PM +0200, jkonecny@redhat.com wrote:
I guess it will be easier to just think about the branching date when the Flock schedule is being created. However, I'm not familiar with the scheduling, so I'm probably not the right person to answer this.
Perhaps Ben Cotton could chime in here. I think now that we have moved to planning the schedules years in advance, it's the flock schedule that moves around a lot based on when facilities are available and other things. I guess we just need to take the branching date into account if it's looking like flock will be near it?
I can! I had a nice conversation with mboddu last week when I was in Westford and the short answer is that there are no easy options. You're right that Flock will move around some based on facility availability, cost, etc. Now that the schedule is more predictable (or at least more explicitly stated), we can try to accommodate it in the Flock planning.
In the next week or so, I hope to publish a commblog post that will include a few different options and what impact those options will have on the schedule.
In the meantime, I'm curious about the history here. In my 10-ish years in the Fedora community prior to taking this job, I never really paid that much attention to the branch point. Is this a problem we've had in the past, or was F31 particularly bad? I know we get failed composes a lot, but my understanding is that this was a perfect storm.
On Mon, 2019-10-28 at 14:29 -0400, Ben Cotton wrote:
On Mon, Oct 28, 2019 at 1:43 PM Kevin Fenzi kevin@scrye.com wrote:
On Tue, Oct 22, 2019 at 02:09:04PM +0200, jkonecny@redhat.com wrote:
I guess it will be easier to just think about the branching date when the Flock schedule is being created. However, I'm not familiar with the scheduling, so I'm probably not the right person to answer this.
Perhaps Ben Cotton could chime in here. I think now that we have moved to planning the schedules years in advance, it's the flock schedule that moves around a lot based on when facilities are available and other things. I guess we just need to take the branching date into account if it's looking like flock will be near it?
I can! I had a nice conversation with mboddu last week when I was in Westford and the short answer is that there are no easy options. You're right that Flock will move around some based on facility availability, cost, etc. Now that the schedule is more predictable (or at least more explicitly stated), we can try to accommodate it in the Flock planning.
In the next week or so, I hope to publish a commblog post that will include a few different options and what impact those options will have on the schedule.
In the meantime, I'm curious about the history here. In my 10-ish years in the Fedora community prior to taking this job, I never really paid that much attention to the branch point. Is this a problem we've had in the past, or was F31 particularly bad? I know we get failed composes a lot, but my understanding is that this was a perfect storm.
As far as I remember, this problem is not new. It just got really problematic in F31. However, I remember that in almost every release we have a gap between the branching and the first compose. Most of the time it's not that long. Please correct me, Kevin, if I'm wrong here.
Jirka
On Tue, Oct 29, 2019 at 02:03:03PM +0100, jkonecny@redhat.com wrote:
As far as I remember, this problem is not new. It just got really problematic in F31. However, I remember that in almost every release we have a gap between the branching and the first compose. Most of the time it's not that long. Please correct me, Kevin, if I'm wrong here.
You are not wrong. :) It has sometimes been a day or two, this time it was like a week... so it was much worse. That's the case we want to avoid moving forward.
kevin
On Mon, 2019-10-28 at 14:29 -0400, Ben Cotton wrote:
On Mon, Oct 28, 2019 at 1:43 PM Kevin Fenzi kevin@scrye.com wrote:
On Tue, Oct 22, 2019 at 02:09:04PM +0200, jkonecny@redhat.com wrote:
I guess it will be easier to just think about the branching date when the Flock schedule is being created. However, I'm not familiar with the scheduling, so I'm probably not the right person to answer this.
Perhaps Ben Cotton could chime in here. I think now that we have moved to planning the schedules years in advance, it's the flock schedule that moves around a lot based on when facilities are available and other things. I guess we just need to take the branching date into account if it's looking like flock will be near it?
I can! I had a nice conversation with mboddu last week when I was in Westford and the short answer is that there are no easy options. You're right that Flock will move around some based on facility availability, cost, etc. Now that the schedule is more predictable (or at least more explicitly stated), we can try to accommodate it in the Flock planning.
In the next week or so, I hope to publish a commblog post that will include a few different options and what impact those options will have on the schedule.
Hi Ben,
Could you please tell us if the post is published yet? And if so, could you please share the link?
Thanks, Jirka
In the meantime, I'm curious about the history here. In my 10-ish years in the Fedora community prior to taking this job, I never really paid that much attention to the branch point. Is this a problem we've had in the past, or was F31 particularly bad? I know we get failed composes a lot, but my understanding is that this was a perfect storm.
-- Ben Cotton He / Him / His Fedora Program Manager Red Hat TZ=America/Indiana/Indianapolis
On Mon, Nov 4, 2019 at 8:13 AM jkonecny@redhat.com wrote:
Could you please tell us if the post is published yet? And if so, could you please share the link?
It is not. It will publish on Thursday. I'll share the link here when it goes up.
Here's the link to the Community Blog post that looks at the schedule options: https://communityblog.fedoraproject.org/accommodating-flock-in-the-release-s...
I have disabled comments on that post so that we can keep the conversation on this thread.
On 07. 11. 19 18:35, Ben Cotton wrote:
Here's the link to the Community Blog post that looks at the schedule options: https://communityblog.fedoraproject.org/accommodating-flock-in-the-release-s...
From the post:
I’m inclined to go with option 0, plus a brief freeze after branch.
This seems like the best option to me as well.
On Thu, Nov 07, 2019 at 08:43:44PM +0100, Miro Hrončok wrote:
On 07. 11. 19 18:35, Ben Cotton wrote:
Here's the link to the Community Blog post that looks at the schedule options: https://communityblog.fedoraproject.org/accommodating-flock-in-the-release-s...
From the post:
I’m inclined to go with option 0, plus a brief freeze after branch.
This seems like the best option to me as well.
Yeah, same here.
kevin
On Fri, 2019-11-08 at 07:20 -0800, Kevin Fenzi wrote:
On Thu, Nov 07, 2019 at 08:43:44PM +0100, Miro Hrončok wrote:
On 07. 11. 19 18:35, Ben Cotton wrote:
Here's the link to the Community Blog post that looks at the schedule options: https://communityblog.fedoraproject.org/accommodating-flock-in-the-release-s...
From the post:
I’m inclined to go with option 0, plus a brief freeze after branch.
This seems like the best option to me as well.
Yeah, same here.
kevin
Ben, thanks a lot for your post. It's a great summary of our options!
I agree with you in general, but I don't like the `brief` wording here. What does that mean exactly? I would rather specify a strong freeze, meaning that the freeze continues until a compose is available.
What do you think?
Jirka
On Fri, Nov 8, 2019 at 11:54 AM jkonecny@redhat.com wrote:
I agree with you in general, but I don't like the `brief` wording here. What does that mean exactly? I would rather specify a strong freeze, meaning that the freeze continues until a compose is available.
By "brief" I mean "only as long as it takes for a compose to become available" as opposed to "for a prescribed period of time." So I think we're in agreement here. Once releng has a successful branched compose, the freeze is lifted.
On Fri, 2019-11-08 at 12:16 -0500, Ben Cotton wrote:
On Fri, Nov 8, 2019 at 11:54 AM jkonecny@redhat.com wrote:
I agree with you in general, but I don't like the `brief` wording here. What does that mean exactly? I would rather specify a strong freeze, meaning that the freeze continues until a compose is available.
By "brief" I mean "only as long as it takes for a compose to become available" as opposed to "for a prescribed period of time." So I think we're in agreement here. Once releng has a successful branched compose, the freeze is lifted.
In that case I'm in! :).
It looks like we all agree on the solution here, so I'll (re)open the FESCo ticket to enable the compose freeze.
Jirka
On 11. 11. 19 8:06, jkonecny@redhat.com wrote:
On Fri, 2019-11-08 at 12:16 -0500, Ben Cotton wrote:
On Fri, Nov 8, 2019 at 11:54 AM jkonecny@redhat.com wrote:
I agree with you in general, but I don't like the `brief` wording here. What does that mean exactly? I would rather specify a strong freeze, meaning that the freeze continues until a compose is available.
By "brief" I mean "only as long as it takes for a compose to become available" as opposed to "for a prescribed period of time." So I think we're in agreement here. Once releng has a successful branched compose, the freeze is lifted.
In that case I'm in! :).
It looks like we all agree on the solution here, so I'll (re)open the FESCo ticket to enable the compose freeze.
Before you do this, can we figure out the details?
1. Do we use bodhi to handle the freeze?
Previously, this would not be possible, as bodhi was not yet "activated" at branching, however with rawhide gating (and hopefully "branched gating" soon), we could.
2. Who handles freeze exceptions, is it releng, QA?
3. How are the freeze exceptions handled, via bugzilla? What are the tracker bugs? Or manually in https://pagure.io/releng/failed-composes/issues ?
On Mon, Nov 11, 2019 at 4:01 AM Miro Hrončok mhroncok@redhat.com wrote:
Before you do this, can we figure out the details?
- Do we use bodhi to handle the freeze?
Previously, this would not be possible, as bodhi was not yet "activated" at branching, however with rawhide gating (and hopefully "branched gating" soon), we could.
I'd defer to releng to determine what the best way to handle this is.
- Who handles freeze exceptions, is it releng, QA?
That's a great question. The point of this freeze isn't testing, it's getting a successful compose, so I would leave it in releng's hands. I don't see that it needs the freeze exception process that we use for releases. Of course QA and everyone else can help in determining what packages need to be updated to fix the compose.
- How are the freeze exceptions handled, via bugzilla? What are the tracker bugs? Or manually in https://pagure.io/releng/failed-composes/issues ?
There or in the main releng tracker. Whatever makes Mohan and Kevin happiest. (Or the least unhappy).
None of this is to throw the problem over the wall and say "it's releng's problem now!" It's more of "I trust them to come up with the best solution since they're the ones closest to the problem."
On Mon, Nov 11, 2019 at 09:51:33AM -0500, Ben Cotton wrote:
On Mon, Nov 11, 2019 at 4:01 AM Miro Hrončok mhroncok@redhat.com wrote:
Before you do this, can we figure out the details?
- Do we use bodhi to handle the freeze?
Previously, this would not be possible, as bodhi was not yet "activated" at branching, however with rawhide gating (and hopefully "branched gating" soon), we could.
I'd defer to releng to determine what the best way to handle this is.
When we do branching we adjust koji targets and tags. My thought was to do the adjustments, but disable some part of the workflow. Either collect packages in the tag to be signed, or collect them in the tag to be autosubmitted to gating by bodhi. Then, once a compose is done, restart that process and process the backlog. If a package is needed for a fix, it can manually be tagged in.
- Who handles freeze exceptions, is it releng, QA?
That's a great question. The point of this freeze isn't testing, it's getting a successful compose, so I would leave it in releng's hands. I don't see that it needs the freeze exception process that we use for releases. Of course QA and everyone else can help in determining what packages need to be updated to fix the compose.
I think this should be completely up to releng. They should only land things they need to get the compose working.
- How are the freeze exceptions handled, via bugzilla? What are the tracker bugs? Or manually in https://pagure.io/releng/failed-composes/issues ?
There or in the main releng tracker. Whatever makes Mohan and Kevin happiest. (Or the least unhappy).
releng tracker... but users should not make exceptions normally, it should only be things needed to make the compose work.
None of this is to throw the problem over the wall and say "it's releng's problem now!" It's more of "I trust them to come up with the best solution since they're the ones closest to the problem."
Yep. I agree...
kevin
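Kevin's idea above (hold incoming builds in a tag during the freeze, let releng manually tag in fixes, then process the backlog once a compose succeeds) could be sketched roughly as follows. This is only a toy model to make the flow concrete: the `BranchFreeze` class, its method names, and the example build names are all hypothetical, and it does not call koji or bodhi or reflect how releng would actually implement it.

```python
from dataclasses import dataclass, field


@dataclass
class BranchFreeze:
    """Toy model of the post-branch freeze; NOT real koji or bodhi code."""

    frozen: bool = True
    # Builds held back while the freeze is active (hypothetical "pending" tag).
    pending: list[str] = field(default_factory=list)
    # Builds visible to the compose (hypothetical compose tag).
    compose_tag: list[str] = field(default_factory=list)

    def submit(self, build: str) -> None:
        """A new build lands: hold it while frozen, otherwise pass it through."""
        if self.frozen:
            self.pending.append(build)
        else:
            self.compose_tag.append(build)

    def tag_fix(self, build: str) -> None:
        """Releng manually tags in a build that is needed to fix the compose."""
        if build in self.pending:
            self.pending.remove(build)
        self.compose_tag.append(build)

    def compose_finished(self, success: bool) -> None:
        """Lift the freeze only after a successful compose, then drain the backlog."""
        if not success:
            return  # a failed compose keeps the freeze in place
        self.frozen = False
        self.compose_tag.extend(self.pending)
        self.pending.clear()
```

In this model a failed compose changes nothing, so the freeze lasts exactly as long as it takes to get a successful compose, which matches the meaning of "brief" agreed earlier in the thread.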
On 11. 11. 19 18:23, Kevin Fenzi wrote:
On Mon, Nov 11, 2019 at 09:51:33AM -0500, Ben Cotton wrote:
On Mon, Nov 11, 2019 at 4:01 AM Miro Hrončok mhroncok@redhat.com wrote:
Before you do this, can we figure out the details?
- Do we use bodhi to handle the freeze?
Previously, this would not be possible, as bodhi was not yet "activated" at branching, however with rawhide gating (and hopefully "branched gating" soon), we could.
I'd defer to releng to determine what the best way to handle this is.
When we do branching we adjust koji targets and tags. My thought was to do the adjustments, but disable some part of the workflow. Either collect packages in the tag to be signed, or collect them in the tag to be autosubmitted to gating by bodhi. Then, once a compose is done, restart that process and process the backlog. If a package is needed for a fix, it can manually be tagged in.
- Who handles freeze exceptions, is it releng, QA?
That's a great question. The point of this freeze isn't testing, it's getting a successful compose, so I would leave it in releng's hands. I don't see that it needs the freeze exception process that we use for releases. Of course QA and everyone else can help in determining what packages need to be updated to fix the compose.
I think this should be completely up to releng. They should only land things they need to get the compose working.
- How are the freeze exceptions handled, via bugzilla? What are the tracker bugs? Or manually in https://pagure.io/releng/failed-composes/issues ?
There or in the main releng tracker. Whatever makes Mohan and Kevin happiest. (Or the least unhappy).
releng tracker... but users should not make exceptions normally, it should only be things needed to make the compose work.
None of this is to throw the problem over the wall and say "it's releng's problem now!" It's more of "I trust them to come up with the best solution since they're the ones closest to the problem."
Yep. I agree...
Excellent. I think this is ready to be presented to fesco for a vote.
If you think a change proposal is the better way, then I'll create a draft, and before sending it I'll contact you off the list to polish it.
Thanks a lot, everyone, for helping. You are the best!
Jirka
On Mon, 2019-11-11 at 19:16 +0100, Miro Hrončok wrote:
On 11. 11. 19 19:09, Miro Hrončok wrote:
Excellent. I think this is ready to be presented to fesco for a vote.
Unless we decide it requires a change proposal. To limit the required bureaucracy, I can help draft it.
-- Miro Hrončok -- Phone: +420777974800 IRC: mhroncok
Here's the draft. It would be great if you can check it out and do the appropriate edits to improve the Change.
https://fedoraproject.org/wiki/Changes/Freeze_after_branching_until_compose_...
Thanks Miro!
Jirka
On 13. 11. 19 14:12, jkonecny@redhat.com wrote:
Here's the draft. It would be great if you can check it out and do the appropriate edits to improve the Change.
https://fedoraproject.org/wiki/Changes/Freeze_after_branching_until_compose_...
I've made minor adjustments.
What is missing: A releng member as owner (Kevin maybe?). A releng ticket. Link to this thread.
On Wed, 2019-11-13 at 14:49 +0100, Miro Hrončok wrote:
On 13. 11. 19 14:12, jkonecny@redhat.com wrote:
Here's the draft. It would be great if you can check it out and do the appropriate edits to improve the Change.
https://fedoraproject.org/wiki/Changes/Freeze_after_branching_until_compose_...
I've made minor adjustments.
What is missing: A releng member as owner (Kevin maybe?). A releng ticket. Link to this thread.
releng member: Kevin, do you want to be listed as the owner contact there?
releng ticket: I'll create that as soon as the proposal is ready.
Link to this thread is at the bottom of the proposal in the Documentation section.
On Wed, Nov 13, 2019 at 02:59:56PM +0100, jkonecny@redhat.com wrote:
On Wed, 2019-11-13 at 14:49 +0100, Miro Hrončok wrote:
On 13. 11. 19 14:12, jkonecny@redhat.com wrote:
Here's the draft. It would be great if you can check it out and do the appropriate edits to improve the Change.
https://fedoraproject.org/wiki/Changes/Freeze_after_branching_until_compose_...
I've made minor adjustments.
What is missing: A releng member as owner (Kevin maybe?). A releng ticket. Link to this thread.
releng member: Kevin, do you want to be listed as the owner contact there?
Sure.
releng ticket: I'll create that as soon as the proposal is ready.
Link to this thread is at the bottom of the proposal in the Documentation section.
Sounds good. Thanks for pushing this through.
kevin
The Release Engineering issue has been created and the change has been proposed.
Thanks a lot, everyone, for your help with the change and with finding the best solution!
Jirka