So those of you who watch the openQA results might have noticed that
for the last month or so, there have been lots of problems with the
'chained' tests: the _base_ tests that run for several of the media
after default_install has finished, using a hard disk snapshot
uploaded by default_install.

I think I've made some progress on this. I've made one change, and
proposed another. The first change: I made it possible to configure
openQA's total asset size limit in the Ansible plays for deploying
openQA, and set the limit for our deployments to 300GB (the default is
100GB), plus a couple of follow-up commits. That's the value that openQA uses
for cleaning up old assets: when there's more than that amount of
assets, it wipes some. What I suspect was happening is that, since we
already have more than 100GB of assets all the time (depending on
exactly how you count; there's some subtlety there), gru was sometimes
wiping uploaded disk images as part of the 'remove old assets' task
before they could be used. This, I think, accounts for the cases where
the chained tests did not run at all, and reported 'setup failure'.
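
To illustrate the suspected failure mode, here is a minimal Python sketch of a size-limited cleanup. It is not openQA's actual code: the Asset class, the assets_to_remove function, the asset names and sizes, and the oldest-first ordering are all assumptions made up for the example. It only shows how a freshly uploaded disk image can be wiped when the total is already over the limit, and why raising the limit avoids that.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    size_gb: float
    age_hours: float  # time since the asset was registered


def assets_to_remove(assets, limit_gb):
    """Return names of assets to delete, oldest first, until the total fits the limit.

    Oldest-first is an assumption for this sketch, not necessarily how gru orders assets.
    """
    total = sum(a.size_gb for a in assets)
    doomed = []
    for asset in sorted(assets, key=lambda a: a.age_hours, reverse=True):
        if total <= limit_gb:
            break
        doomed.append(asset.name)
        total -= asset.size_gb
    return doomed


# Hypothetical state: a disk image was uploaded an hour ago, then newer
# compose ISOs arrived before the chained tests got a chance to run.
assets = [
    Asset("compose_1_isos", 50, 0.5),
    Asset("compose_2_isos", 55, 0.2),
    Asset("disk_64bit.qcow2", 5, 1.0),
]

print(assets_to_remove(assets, 100))  # the uploaded disk image is among the victims
print(assets_to_remove(assets, 300))  # nothing needs to be removed
```

With the limit at 100 the total (110GB in this made-up state) is always over, so every cleanup run has to delete something; with the limit at 300 the same state is left alone.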

The second change: https://phab.qadevel.cloud.fedoraproject.org/D787
What that does is make tests which upload a disk image try to shut
down the VM cleanly before doing so. Without that change, I think we
were actually uploading the disk image file while the virtual machine
was still running, so the image could be captured with writes not yet
flushed to disk. I believe that was the cause of the cases where the
Server DVD post-install tests ran, but all failed because the system
never reached a login prompt.
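
The ordering that change aims for can be sketched in Python as follows. This is not the code in D787 (the real tests are not written in Python); FakeVM, upload_disk_image, and the timeout and poll values are hypothetical names and numbers invented for the illustration. The point is only the sequence: request a clean shutdown, wait until the guest is really off, and only then upload.

```python
import time


class FakeVM:
    """Stand-in for a test VM; powers off a few polls after shutdown is requested."""

    def __init__(self, polls_until_off=3):
        self.polls_left = None  # None means no shutdown has been requested yet
        self.polls_until_off = polls_until_off

    def request_shutdown(self):
        self.polls_left = self.polls_until_off

    def is_running(self):
        if self.polls_left is None:
            return True
        if self.polls_left > 0:
            self.polls_left -= 1
            return True
        return False


def upload_disk_image(vm, upload, timeout=60.0, poll=0.01):
    """Shut the guest down cleanly, wait for power-off, then call upload()."""
    vm.request_shutdown()
    deadline = time.monotonic() + timeout
    while vm.is_running():
        if time.monotonic() > deadline:
            # Uploading now would capture a disk that is still being written to.
            raise TimeoutError("guest did not power off; refusing to upload")
        time.sleep(poll)
    upload()


uploaded = []
upload_disk_image(FakeVM(), lambda: uploaded.append("disk.qcow2"))
```

Refusing to upload on timeout is a design choice of this sketch: a missing image gives an obvious failure, while a possibly inconsistent image gives confusing failures in every test that boots from it.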

I'm hoping that once D787 is reviewed and applied, almost all openQA
tests will be working again. That leaves only the i386 kernel problem
and whatever's going wrong with server_kickstart_hdd (I don't *think*
that's a corrupt image problem like server_updates_img_local was; I
think it may be a genuine bug).

Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net