On Mon, Jun 15, 2020 at 6:54 PM Kevin Fenzi <kevin(a)scrye.com> wrote:
Greetings everyone.
As you hopefully know, we took down our old staging env in phx2 as part
of the datacenter move. Once machines from phx2 are shipped to iad2,
racked, installed, and set up, we can look at re-enabling our staging
env.
However, I'd like to ask everyone about how we want that to look.
Some questions:
* Before we had a staging openshift with staging applications in it.
This is sort of not how openshift is designed to work. In the ideal
openshift world you don't need staging, you just have enough tests and
CI and gradual rollout of new versions so everything just works.
Granted, a staging openshift cluster is useful to ops folks to test
upgrades and try out things, and it's useful for developers in our case
to get all the parts set up right in ansible to deploy their application.
So, what do you think? Should we set up a staging cluster as before?
Or shall we try and just use the one production cluster for staging and
prod?
I think having a staging OpenShift would still be useful, because we
probably still have cases where we may want to test a mixture of
cluster and app changes together, and that'd be difficult to do in the
production OpenShift system.
* Another question is openshift 4. Openshift 3.11 is supported until
June of 2022, so we have some time, but do we want to or need to look at
moving to openshift 4 for our clusters? One thing I hate about this is
that you must have 3 master nodes, and the only machines we have are big
powerful virthost servers, so it's very wasteful of resources to
deploy an openshift 4 cluster (with the machines we have currently
anyhow).
You could make your OpenShift master nodes schedulable for regular
workloads, so it's less wasteful. I think it'd make a ton of sense to
look at OpenShift/OKD 4 with this move. There *are* differences with
OpenShift 4, and we will likely have to do some significant
adaptations to our deployment code for it, but it would be worth it
for the simpler maintenance of the cluster.
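To make the schedulable-masters idea a bit more concrete, here is a rough
sketch, not something tested against our setup. It assumes OpenShift 4's
cluster-wide Scheduler config (schedulers.config.openshift.io, object
named "cluster") and its spec.mastersSchedulable field; the helper name
masters_schedulable_patch is made up for illustration. It only builds the
merge patch and prints the oc command, it does not touch any cluster.

```python
import json


def masters_schedulable_patch(enabled: bool = True) -> str:
    """Build a JSON merge patch for the cluster Scheduler config.

    Assumption: an OpenShift 4 cluster, where the Scheduler object named
    "cluster" has a spec.mastersSchedulable boolean.
    """
    return json.dumps({"spec": {"mastersSchedulable": enabled}})


if __name__ == "__main__":
    patch = masters_schedulable_patch(True)
    # Print the command instead of running it, so this is safe anywhere.
    print(
        "oc patch schedulers.config.openshift.io cluster "
        f"--type merge -p '{patch}'"
    )
```

Whether we'd want that on by default probably depends on how much
headroom the virthosts leave for the control plane.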
* In our old staging env we had a subset of things. For some of them we
used the staging instances all the time; others we almost never did. I'm
not sure we have the resources to deploy a 100% copy of our prod env, but
assuming we did, where should we shoot for on the line? i.e., between a
100% duplicate of prod and nothing?
* We came up with a pretty elaborate koji sync from prod->staging. There
were lots of reasons we got to that, but I suppose if someone wants to
propose another method of doing this we could revisit that.
* Any other things we definitely want from a staging env?
I think that as long as we have a relatively close replica of our prod
infrastructure, it will be useful for being able to experiment with
improvements to our infra in a safe way.
* Has staging been helpful to you?
Staging has been immensely helpful for the things I work with (Package
infra, Pagure, etc.). It has helped us test new releases and changes
and catch bugs before we make releases. I hope we continue to have it.
* Is there anything we could do to make it better?
I don't particularly have a lot of access, which is kind of
frustrating. But that's a problem specific to me rather than something
wrong with the staging infra env. :)
--
真実はいつも一つ!/ Always, there's only one truth!