Hey folks! Just so I'm not the only one who knows this, here's a quick update on how all the compose process magic I'm dealing with is working now.
We now have three fedmsg-hub consumer implementations:
1. OpenQAConsumer (lives in openqa_fedora_tools)
2. CheckComposeConsumer (lives in fedora-qa/check-compose)
3. RelvalConsumer (lives in fedora-qa/relvalconsumer)
The first two run on the openQA server hosts; installing and enabling them is part of the ansible plays:
https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/openqa/...
https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/check-c...
The third is running on my web server. This is not for any good reason, it's just because that's where we used to do the same job. It could really run anywhere, we just need to make sure exactly one of its RelvalProductionConsumer instances is running somewhere all the time.
All three listen for the org.fedoraproject.prod.pungi.compose.status.change fedmsgs for Pungi 4 composes and fire when the status is FINISHED or FINISHED_INCOMPLETE. The first two also listen for the org.fedoraproject.prod.compose.23.cloudimg-staging.done messages for two-week Atomic nightly composes. RelvalConsumer does not, because we do not create validation events for those.
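To make the firing rule concrete, here is a minimal sketch of the check in plain Python. It is not the code from any of the three consumers: the function name, the `handle_atomic` flag, and the assumption that the Pungi message payload carries its state under a `status` key are all illustrative.

```python
# Topics named in this mail.
PUNGI_TOPIC = "org.fedoraproject.prod.pungi.compose.status.change"
ATOMIC_TOPIC = "org.fedoraproject.prod.compose.23.cloudimg-staging.done"

# Pungi statuses that count as "compose is done".
DONE_STATUSES = {"FINISHED", "FINISHED_INCOMPLETE"}


def should_fire(topic, payload, handle_atomic=True):
    """Return True if a consumer should act on this message.

    `payload` is assumed to be the message body as a dict with a
    'status' key for Pungi messages (an assumption for this sketch).
    `handle_atomic` is a hypothetical switch: True for the two
    openQA-side consumers, False for RelvalConsumer.
    """
    if topic == PUNGI_TOPIC:
        return payload.get("status") in DONE_STATUSES
    if topic == ATOMIC_TOPIC:
        return handle_atomic
    return False
```

With this, `should_fire(PUNGI_TOPIC, {"status": "STARTED"})` is false, and RelvalConsumer's behaviour corresponds to calling it with `handle_atomic=False`.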
OpenQAConsumer creates openQA jobs for new composes. Right now we do *not* have anything which reports results to the wiki, because we don't have openQA emitting fedmsgs and we can't have a consumer block for two hours waiting for tests to finish. I'm hoping to get openQA emitting fedmsgs soon; if this turns out to be harder than anticipated, we can use the same hack CheckComposeConsumer uses (see below).
CheckComposeConsumer produces the 'compose check' emails. In order to do this, the openQA jobs have to be finished, but we don't have a fedmsg for that. So as a temporary hack, the consumer simply forks off a run of the `check-compose` script, which will wait for the openQA jobs to complete, then send the email. One limitation of this is that if the fedmsg-hub service gets killed or restarted while a check-compose process is sitting there waiting for openQA jobs to finish, the check-compose process gets killed, because it's part of the hub service's cgroup. I haven't found a way to avoid this yet (the attempt in the code only works for process groups, and systemd acts on cgroups). So we can lose mails if we get fedmsg-hub restarts at unfortunate times.
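For anyone wondering what "only works for process groups" means in practice, here is a rough sketch of that kind of detachment. It is not necessarily what the consumer does; `spawn_detached` is a hypothetical helper, and the command you would pass for a real `check-compose` run is not shown here.

```python
import os
import subprocess
import sys


def spawn_detached(cmd):
    """Start `cmd` in a new session, and so in its own process group.

    This protects the child from signals sent to the parent's process
    group. It does NOT move the child out of the parent's cgroup, so
    something that kills by cgroup (as systemd does when stopping a
    service by default) will still take it down.
    """
    return subprocess.Popen(
        cmd,
        start_new_session=True,      # child calls setsid() before exec
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


if __name__ == "__main__":
    # Stand-in for a long-running waiter process.
    proc = spawn_detached([sys.executable, "-c", "import time; time.sleep(2)"])
    print("parent process group:", os.getpgrp())
    print("child process group: ", os.getpgid(proc.pid))
    proc.wait()
```

The two process group IDs differ, but on a systemd host both processes would still appear under the same service cgroup, which is why the restart problem remains.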
RelvalConsumer creates release validation events (i.e. it makes all the wiki pages and sends out an announcement mail). It does exactly the same job that used to be done by a cron job on the same box, which ran `relval nightly --if-needed` every day at a time we hoped the compose was complete. It uses all the same logic, so it should create events at much the same frequency as before. Note that it *replaces* relval for this task rather than calling it; it uses wikitcms directly.
Right now, it only handles nightlies. This is because we still haven't decided what milestone composes will look like, so I couldn't write the code to handle them. But once we *do* decide that, I intend to enhance the consumer to handle them, and all compose event creation will be entirely automated (we won't have to run relval by hand to create milestone events, as we have had to up till now).
Longer term I would like to make it so all these things are taskotron tasks, obviously. We just need to work through the process of adding the trigger and resolving the question of how we run the tasks in an environment with the necessary credentials available.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
qa-devel@lists.fedoraproject.org