tl;dr: On Sunday 23rd February, there will be a Copr outage. It will last the whole day.
PPC64LE builders and chroots will be deactivated. The PPC64LE builders should be back in a
matter of weeks.
Hi.
As previously announced, Fedora's infrastructure is moving to a different datacenter.
For some servers, the move is trivial. Copr servers are different. The Copr build system
consists of four servers, plus four staging servers, with eight TB of repos, four TB of
dist-git, and several small volumes.
The original plan was to move to the IAD2 datacenter in Washington D.C. by June. However,
Copr is running in Fedora OpenStack, and this cloud has to be evacuated by the beginning
of March to free an IP range.
The plan was to move Copr to new hardware (thanks to Red Hat) and later move this HW to
the new datacenter. That would mean two outages, the second of which would last at least
15 days (!).
We were looking for another option and we found it. We are going to move Copr to Amazon
AWS and shut down the old VMs in Fedora Cloud. The new HW will then be moved to the IAD2
datacenter, and after that we will move Copr from AWS to the new HW in IAD2. FYI, the
final destination is still subject to change. This still means two outages, but each
should last just a few hours, and the web server with DNF repositories should be
available the whole time.
The second outage will happen in May or June.
Here is a detailed schedule. We are going to update this table during the migration, so
you can use it to watch the progress:
https://docs.google.com/spreadsheets/d/1jrCgdhseZwi91CTRlo9Y5DNwfl9VHoZfj...
Here is a short summary:
* we are continuously running rsync to the new location
* we spin up staging and production instances in the new location
* on Sunday morning we stop the frontend and therefore stop accepting new jobs. The
backend with DNF repos will still be operational.
* we do the final rsync (~6 hours)
* around 13:00 UTC we switch DNS to the new location
* we then enable all services
* once we confirm that everything is operational, the outage will be over
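To plan around these times, here is a minimal Python sketch that converts the announced
13:00 UTC DNS switch to another UTC offset and estimates when the frontend stops. It
assumes the year is 2020 (23 February 2020 is a Sunday) and that the final rsync takes the
estimated ~6 hours and ends right at the DNS switch. The function names are made up for
this illustration, and the real schedule may shift by hours.

```python
from datetime import datetime, timedelta, timezone

# Assumption: year 2020, inferred from "Sunday 23rd February" and "June 2020".
DNS_SWITCH = datetime(2020, 2, 23, 13, 0, tzinfo=timezone.utc)
# Assumption: the final rsync takes the estimated ~6 hours and ends at the DNS switch.
FINAL_RSYNC = timedelta(hours=6)


def estimated_frontend_stop(dns_switch=DNS_SWITCH, rsync=FINAL_RSYNC):
    """Rough estimate of when the frontend stops accepting new jobs."""
    return dns_switch - rsync


def to_offset(moment, utc_offset_hours):
    """Express a timezone-aware moment in a fixed UTC offset, e.g. 1 for UTC+1."""
    return moment.astimezone(timezone(timedelta(hours=utc_offset_hours)))


if __name__ == "__main__":
    stop = estimated_frontend_stop()
    print("Frontend stop (estimate, UTC):", stop.isoformat())
    print("DNS switch (UTC):             ", DNS_SWITCH.isoformat())
    print("DNS switch (UTC+1):           ", to_offset(DNS_SWITCH, 1).isoformat())
```

Under these assumptions the frontend would stop around 07:00 UTC, which matches "Sunday
morning" above. Treat this as an estimate only, not as a commitment.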
There are several caveats:
* After we enable services on Sunday at 13:00 UTC, you may see some failures. Rest assured
that we will address them swiftly.
* Once we leave Fedora Cloud, we will lose access to the PPC64LE builders. We are going to
deactivate those chroots just before the migration. We should get them back after a few
weeks, but the ETA is unknown. In the worst case, it will be June 2020.
We will aim to bring them back as soon as possible.
* Any small issue can easily change the schedule by hours. E.g., a simple 'chown -R' on
the backend takes ~4 hours.
There are going to be three Copr engineers and one fedora-infrastructure member available
for the whole of Sunday. If you experience a problem, do not hesitate to contact us. We
are on #fedora-buildsys on Freenode.
The link to the outage ticket is:
https://pagure.io/fedora-infrastructure/issue/8668
--
Miroslav Suchy, RHCA
Red Hat, Associate Manager ABRT/Copr, #brno, #fedora-buildsys