On 05/15/2018 08:12 PM, Zach Villers wrote:
> Not what folks are asking, but you can pick up a lot just by hanging out and watching the
> on-call person mutter to themselves in IRC (not sure that is the right phrase, umm...). I
> don't know if this will translate for everyone, but there is a concept of "rubber
> duck" problem solving, where if you have a particularly difficult issue and you
> explain it to someone, it helps you solve the problem more easily. I don't know if
> this is how everyone works, and it really doesn't help if someone is jumping up and
> down and quacking while you are trying to think. I guess my point is, just hanging out
> unobtrusively when you can is fairly helpful all around.

Yeah!
https://en.wikipedia.org/wiki/Rubber_duck_debugging
But yes, we already do talk about what we are doing, what's going
wrong, and what the fix might be in IRC. Everyone is welcome to watch/ask
questions.

> If I recall, alerts are pretty easily accessible. You can poke around
> on Nagios if there are issues. Obviously if everything is down/red, it's not a good
> time to ask for help with your ssh access.

Indeed. Yep. Nagios is pretty available to all.
I can try and make sure I note exactly what I am doing to clear an
alert... sometimes I am bad about saying "fixing that" or "poking
that" without saying what exactly is going on.

> A couple of ideas:
> - stream your terminal session when working an outage (could be hard to find a 100%
> FOSS version that is secure)

Yeah. ;( There are things like tmux that might make this possible.
However, we do want to make sure someone else cannot take control of our
sessions. :)
I did look for some screencast-type software for the command line a while
back and was disappointed that all of them needed something non-free or a
website to view/decode. ;( I guess there is always 'typescript'.

> - plan some outages in stage for apprentices to work at some time
> when tickets are low and nothing urgent is planned (I don't know that I've ever
> heard of such a time, but in theory it could exist)

Ha. Yeah, that's a nice idea.
One thing I would very much like to do is move all the *stg* services to
a noc01.stg instance that doesn't page, just IRC and non-urgent email.
Once that's separate we could indeed try and run some alerts. :)
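
Purely as an illustration of that split, here is a rough Python sketch. The real routing would live in the Nagios configuration, not in code like this; the function name, the channel names, the example host names, and the assumption that non-staging hosts keep paging are all made up for the example.

```python
from fnmatch import fnmatch

# Hypothetical channel names, for illustration only.
STAGING_CHANNELS = ["irc", "email"]              # no pager for *stg* hosts
PRODUCTION_CHANNELS = ["pager", "irc", "email"]  # assumption: production keeps paging


def channels_for(host):
    """Return the notification channels an alert for `host` would use."""
    # Anything matching the *stg* pattern goes to the non-paging set.
    if fnmatch(host, "*stg*"):
        return list(STAGING_CHANNELS)
    return list(PRODUCTION_CHANNELS)


if __name__ == "__main__":
    # Example host names are invented.
    for host in ("app01.stg.example.org", "app01.example.org"):
        print(host, "->", ", ".join(channels_for(host)))
```

With something like that in place, a practice alert on a staging host would only ever land in IRC and email.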

> Of course, it's late here, so this may all turn out to be nonsense, but good
> discussion anyway.

No no, it was great.
I think discussing this is good... and hopefully we can get folks more
involved (however that happens).
kevin