Just a note - we don't need an increased timeout, if we deploy newer python-yamlish. That could have been done a long time ago, but we forgot to update the spec file, so nobody found out. That's why I tried to fix it in https://phab.qadevel.cloud.fedoraproject.org/D337 . Once that is accepted, the next libtaskotron build will require the newer python-yamlish.
----- Forwarded Message -----
From: "Tim Flink" tflink@redhat.com
To: infrastructure@lists.fedoraproject.org
Sent: Thursday, April 9, 2015 11:42:43 PM
Subject: Freeze Break Request: Increase execution timeouts in taskotron's buildbot
As the set of packages in f22 has grown with freeze, some of the tasks (most often depcheck) are not completing before hitting the default timeout of 20 minutes for execution in buildbot.
I want to double the timeout for task execution from 20 to 40 minutes. I've made the change in dev and stg already and the change works - long tasks are no longer being killed prior to completion. This freeze break is for production.
+1s?
Tim
diff --git a/roles/taskotron/buildmaster-configure/templates/taskotron.master.cfg.j2 b/roles/taskotron/bu
index d7a698f..1a63b0e 100644
--- a/roles/taskotron/buildmaster-configure/templates/taskotron.master.cfg.j2
+++ b/roles/taskotron/buildmaster-configure/templates/taskotron.master.cfg.j2
@@ -175,9 +175,7 @@
 factory.addStep(ShellCommand(command=["runtask", Interpolate('%(prop:taskname)s.yml')],
 descriptionDone=[Interpolate('%(prop:taskname)s on %(prop:item)s')],
 name='runtask',
-{% if deployment_type in ['dev', 'stg'] %}
 timeout=2400,
-{% endif %}
 logfiles={'taskotron.log': {'filename': '/var/log/taskotron/taskotron.log',
_______________________________________________ infrastructure mailing list infrastructure@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/infrastructure
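As a rough illustration of what the diff in the forwarded request changes, here is a minimal Python sketch of which timeout ends up on the runtask step before and after the patch. The function name, the `patched` flag and the `'prod'` deployment type are made up for this example (the template itself only names 'dev' and 'stg'); the 1200-second figure is the 20-minute default mentioned in the request.

```python
# Illustrative only: mimics the Jinja2 conditional from the diff in plain Python.
DEFAULT_TIMEOUT_S = 20 * 60   # 20-minute default quoted in the request
RAISED_TIMEOUT_S = 2400       # value from the template, i.e. 40 minutes


def runtask_timeout(deployment_type, patched):
    """Return the timeout (in seconds) that applies to the runtask step.

    Before the patch, timeout=2400 is only rendered for dev and stg.
    After the patch, the conditional is gone and it is always rendered.
    """
    if patched or deployment_type in ('dev', 'stg'):
        return RAISED_TIMEOUT_S
    return DEFAULT_TIMEOUT_S


if __name__ == "__main__":
    # 'prod' is a hypothetical deployment_type value used for illustration.
    for dep in ('dev', 'stg', 'prod'):
        before = runtask_timeout(dep, patched=False) // 60
        after = runtask_timeout(dep, patched=True) // 60
        print(f"{dep}: {before} min -> {after} min")
```

With these assumptions, only the production value changes, which matches the request being a freeze break for production alone.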
On Thu, 16 Apr 2015 03:01:07 -0400 (EDT) Kamil Paral kparal@redhat.com wrote:
Just a note - we don't need an increased timeout, if we deploy newer python-yamlish. That could have been done a long time ago, but we forgot to update the spec file, so nobody found out. That's why I tried to fix it in https://phab.qadevel.cloud.fedoraproject.org/D337 . Once that is accepted, the next libtaskotron build will require the newer python-yamlish.
I'd argue that we do need to get the increased timeout working, but that doesn't negate your point about python-yamlish.
However, there are two problems:
1. There's no newer python-yamlish package to update to - that's something that I wrote the specfile for and isn't in the fedora repos, as far as I know.
2. We can't deploy a newer libtaskotron in production yet - that'll bring in stuff that I don't want to push into production yet.
That being said, I've started a build of python-yamlish-0.18 and will get that pushed out to dev/stg later today.
Tim
Yeah, I was referring to the particular issues we were seeing in the past - those should be resolved by a newer yamlish. Thanks for the update.
There are still some other issues for which the increased timeout will be useful, until we have a better way to deal with them. I reviewed the 2-hour-long upgradepath runs yesterday: the main check, including all Koji communication, takes 10 minutes; the rest is communication with Bodhi. 2 hours of querying Bodhi! It's absolutely insane. Of course we don't usually have 550 builds to check, but we will need to do something about it, otherwise we'll have issues every time there is a freeze. And we might even be partly responsible for Bodhi being so overloaded lately.
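To put the upgradepath numbers in perspective, here is a back-of-the-envelope calculation in Python. It uses only the figures quoted above and treats them as approximate; it also assumes the Bodhi time is spread evenly over the builds, which the message doesn't actually say.

```python
# Rough figures from the message above; all approximate.
total_run_min = 120   # "2-hour-long" upgradepath run
koji_min = 10         # main check, including all Koji communication
builds = 550          # builds checked during the freeze

# Whatever is not the main check is time spent talking to Bodhi.
bodhi_min = total_run_min - koji_min

# Assumption: Bodhi time is spread evenly across all builds.
bodhi_s_per_build = bodhi_min * 60 / builds

print(f"Bodhi: {bodhi_min} min total, about {bodhi_s_per_build:.0f} s per build")
```

Under those assumptions it comes to roughly 12 seconds of Bodhi communication per build, compared to about 1 second per build for everything else.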
qa-devel@lists.fedoraproject.org