Hi all,
This is a newbie question with respect to FreeIPA, and I haven't seen this
elsewhere, so I thought I'd ask.
I've just cleaned up an issue with trying to implement a new replica on our
domain, and I've realized that there are a couple of areas I don't
understand that are causing more stress than I need at this point, for I am
maximally paranoid and overworked, and I need some advice on how not to
make that worse when implementing FreeIPA.
I've been reading this list on and off for most of the last year, and
what's struck me is how complicated this project is, especially, of course,
in areas where I personally have less expertise (e.g., LDAP, Kerberos).
Implementing and managing this is just one of my jobs, and our IdM becomes
more critical as time goes on, so I need to understand a few things.
Question 1: I just had to manually modify LDAP to remove data for the first
time, and I'd like to understand how taking that approach can backfire.
In other words, it's clear this isn't a good habit because, for example,
mistakes will replicate. But where are the real danger points? For example,
I've seen stories of having to recover a server in an
environment where the time was intentionally set back to allow an operation
on an expired cert. As with a database update that triggers other
(unexpected) changes, are there LDAP operations that can't practically be
"un-done"? I know that deleting records is permanent (obvious), but are
there gotchas where changing a particular object fires a large number of
events that can't be reverted? Or, in colloquial terms, are there
places where it's really, really bad to try to outwit the management
interfaces? I'm not talking about everyday updates, but instances where we
get stuck (as I was yesterday and today).
Question 2: How to stay safe. Our installation runs on a small network as a
pair of masters replicating with each other, with CA and DNS installed on both.
They are VMs allocated on separate physical hosts. Before updates, I
snapshot both for recovery purposes. I update one at a time, now, and
ensure that it's functional before doing the other one (I know that
schema changes will replicate before the update itself is applied on the
other server, but I can't do anything about that). I haven't had to deal with the question
of how to make sure my (self-signed) CA certificate doesn't expire, but I
know when it will and am leaving myself ample time to understand that. I'm
about to start pushing out this functionality to multiple geos, again at
small-ish scale, and I'm reading the topology references to be sure I
understand what I'm doing. But: what am I missing? I get the impression
that trying to (conventionally) "back up" data isn't useful. I've tried to
design the network to make it as simple as possible to meet our needs.
Does anyone have anything to add in the sense of "best practices" or, to
echo my first question, "there be dragons <here>"?
I've learned a lot from this list, and I'd also like to add a thank you to
everyone here who has helped with that! I do see this getting better and
better, and it's appreciated.