Hey folks!
I have a few questions about the final deployment of the FMN replacement:
- There's been a request to handle a user being disabled in IPA, which should trigger their rules being disabled (FMN#826). We can do that but we have questions about re-enablement: should the rules be auto-enabled when the user is re-enabled? What about rules that the user may have disabled previously? Should we just leave things disabled?
- How do you see the transition to the new system? We were thinking:
  - move the current FMN to a different URL, such as notifications-old.fp.o. It will still be processing messages and sending notifications
  - run the new system in notifications.fp.o (in place of the old)
  - add a small banner to the new system to point people to the old in case they want to change their rules there
  Is it too quick? Should we deploy the new one to a notifications-new URL first? That's one more step to get to the final setup, so more work for you. You get to decide :-)
  → Michal & Kevin are fine with this plan.
- IRC account: we can't connect to libera.chat with the same account twice, we'll need a second account. I can create it, but do you prefer that we:
  1. run the old FMN on the new account (fedora-notifs-old) and the new FMN on the usual account
  2. run the old FMN on the usual account and the new FMN on the new account (fedora-notifs-new) and then switch back to the usual one when we retire the old FMN
  3. run the old FMN on the usual account and the new FMN on a new account (fedora-fmn) and just drop the old account when we retire the old FMN
  (I guess that's connected to the URL transition issue)
  → Kevin prefers option 3.
- Let's do an "ask us anything" session for infra people. We can do that on IRC or you can ask all your questions in this email thread.
- When should we do the switch? → Kevin suggests next week, with an email to devel-announce.
What do you think? I've had a few answers from Michal & Kevin already and added them inline.
Kevin also notes that we may need a RHIT ticket to allow IRC out from OpenShift. I'll test it as soon as I've created the new IRC account. He also suggests a sunset date for the old FMN after the F39 release.
Thanks!
Aurélien
Oh yeah one more thing:
- How do you see the transition to the new system? We were thinking:
- move the current FMN to a different URL, such as
notifications-old.fp.o. It will still be processing messages and sending notifications
- run the new system in notifications.fp.o (in place of the old)
The current FMN actually lives at apps.fp.o/notifications. We should probably switch to notifications.fp.o, no? If you agree, then we'll deploy to notifications.fp.o, move the old one to apps.fp.o/notifications-old, and set up a redirect from apps.fp.o/notifications to notifications.fp.o.
A.
On Wed, Apr 19, 2023 at 7:55 PM Aurelien Bompard abompard@fedoraproject.org wrote:
Oh yeah one more thing:
- How do you see the transition to the new system? We were thinking:
- move the current FMN to a different URL, such as
notifications-old.fp.o. It will still be processing messages and sending notifications
- run the new system in notifications.fp.o (in place of the old)
The current FMN actually lives at apps.fp.o/notifications. We should probably switch to notifications.fp.o, no?
a big +1 from me to moving to notifications.fp.o
cheers, ryanlerch
If you agree, then we'll deploy to notifications.fp.o, move the old one to apps.fp.o/notifications-old, and set up a redirect from apps.fp.o/notifications to notifications.fp.o.
A.
_______________________________________________
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedorapro...
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
On Wed, 19 Apr 2023 at 02:56, Aurelien Bompard abompard@fedoraproject.org wrote:
Hey folks!
I have a few questions about the final deployment of the FMN replacement:
- There's been a request to handle a user being disabled in IPA, which
should trigger their rules being disabled (FMN#826). We can do that but we have questions about re-enablement: should the rules be auto-enabled when the user is re-enabled? What about rules that the user may have disabled previously? Should we just leave things disabled?
- How do you see the transition to the new system? We were thinking:
- move the current FMN to a different URL, such as
notifications-old.fp.o. It will still be processing messages and sending notifications
- run the new system in notifications.fp.o (in place of the old)
- add a small banner to the new system to point people to the old in case they want to change their rules there
Is it too quick? Should we deploy the new one to a notifications-new URL first? That's one more step to get to the final setup, so more work for you. You get to decide :-)
→ Michal & Kevin are fine with this plan.
I was going to say that one thing you need to 'add' is announcing this plan of changes to devel and users mailing lists and the equivalent discourse at least 3 times.
1. Tell them what you are going to do (a couple of days to a week in advance).
   a. What the new URLs will be
   b. How to open tickets if you have issues with the new tool
   c. How to change settings etc.
2. Tell them you are doing it. (on the day it rolls out)
3. Tell them what you did. (a week after you did the change).
Most of the complaints will be that you never told people or they weren't informed. Doing it repeatedly like this allows for others to forward at least one (if not all three) to the various people who will say so. Going from the numbers of bounced emails we get at bastion, there are a ton of people who set it up in firehose fashion to some email address and then send all that to spam afterwards. They keep some filter locally for the messages they want and when it changes they will come and complain about it not working anymore.
- IRC account: we can't connect to libera.chat with the same account
twice, we'll need a second account. I can create it, but do you prefer that we:
1. run the old FMN on the new account (fedora-notifs-old) and the new FMN on the usual account
2. run the old FMN on the usual account and the new FMN on the new account (fedora-notifs-new) and then switch back to the usual one when we retire the old FMN
3. run the old FMN on the usual account and the new FMN on a new account (fedora-fmn) and just drop the old account when we retire the old FMN
(I guess that's connected to the URL transition issue)
→ Kevin prefers option 3.
- Let's do an "ask us anything" session for infra people. We can do
that on IRC or you can ask all your questions in this email thread.
- When should we do the switch?
→ Kevin suggests next week, with an email to devel-announce.
What do you think? I've had a few answers from Michal & Kevin already and added them inline.
Kevin also notes that we may need a RHIT ticket to allow IRC out from OpenShift. I'll test it as soon as I've created the new IRC account. He also suggests a sunset date for the old FMN after the F39 release.
Thanks!
Aurélien
On Wed, Apr 19, 2023 at 08:55:58AM +0200, Aurelien Bompard wrote:
Hey folks!
I have a few questions about the final deployment of the FMN replacement:
- There's been a request to handle a user being disabled in IPA, which
should trigger their rules being disabled (FMN#826). We can do that but we have questions about re-enablement: should the rules be auto-enabled when the user is re-enabled? What about rules that the user may have disabled previously? Should we just leave things disabled?
I think this is a pretty rare case. I guess I would expect to just leave them all disabled and let the user go enable them again once they can log back in?
- How do you see the transition to the new system? We were thinking:
- move the current FMN to a different URL, such as
notifications-old.fp.o. It will still be processing messages and sending notifications
- run the new system in notifications.fp.o (in place of the old)
- add a small banner to the new system to point people to the old in case they want to change their rules there
Is it too quick? Should we deploy the new one to a notifications-new URL first? That's one more step to get to the final setup, so more work for you. You get to decide :-)
→ Michal & Kevin are fine with this plan.
yep. I think smooge's suggestion downthread to announce the plan is good.
and schedule a time to roll out the new and move the old so people know to expect it.
- IRC account: we can't connect to libera.chat with the same account
twice, we'll need a second account. I can create it, but do you prefer that we:
1. run the old FMN on the new account (fedora-notifs-old) and the new FMN on the usual account
2. run the old FMN on the usual account and the new FMN on the new account (fedora-notifs-new) and then switch back to the usual one when we retire the old FMN
3. run the old FMN on the usual account and the new FMN on a new account (fedora-fmn) and just drop the old account when we retire the old FMN
(I guess that's connected to the URL transition issue)
→ Kevin prefers option 3.
yep. since people will be setting things up new it makes sense to me to have a new user messaging you. Also less swapping around, it's just set up and done.
- Let's do an "ask us anything" session for infra people. We can do
that on IRC or you can ask all your questions in this email thread.
+1
- When should we do the switch?
→ Kevin suggests next week, with an email to devel-announce.
What do you think? I've had a few answers from Michal & Kevin already and added them inline.
Kevin also notes that we may need a RHIT ticket to allow IRC out from OpenShift. I'll test it as soon as I've created the new IRC account. He also suggests a sunset date for the old FMN after the F39 release.
yep!
thanks!
kevin
Oh, one more thing.
fmn has been alerting a lot. I am pretty sure from your description this is because when it's rebuilding its cache it doesn't process anything.
Apr 19 00:20:39 <zodbot> Aurélien Bompard - ansible.git:ad66644567 [*roles/openshift-apps/fmn/templates/deploymentconfig.yml] FMN: run the IRC sender now
Apr 19 00:40:56 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is CRITICAL: RABBITMQ_QUEUE CRITICAL - messages CRITICAL (830), messages_ready OK (805) messages_unacknowledged OK (25) consumers OK (1) (noc01)
Apr 19 01:00:57 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is CRITICAL: RABBITMQ_QUEUE CRITICAL - messages CRITICAL (3052), messages_ready OK (3027) messages_unacknowledged OK (25) consumers OK (1) (noc01)
Apr 19 02:00:57 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is CRITICAL: RABBITMQ_QUEUE CRITICAL - messages CRITICAL (9903), messages_ready OK (9878) messages_unacknowledged OK (25) consumers OK (1) (noc01)
Apr 19 03:00:57 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is CRITICAL: RABBITMQ_QUEUE CRITICAL - messages CRITICAL (19733), messages_ready OK (19708) messages_unacknowledged OK (25) consumers OK (1) (noc01)
Apr 19 04:00:57 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is CRITICAL: RABBITMQ_QUEUE CRITICAL - messages CRITICAL (28472), messages_ready OK (28447) messages_unacknowledged OK (25) consumers OK (1) (noc01)
Apr 19 05:00:57 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is CRITICAL: RABBITMQ_QUEUE CRITICAL - messages CRITICAL (36369), messages_ready OK (36344) messages_unacknowledged OK (25) consumers OK (1) (noc01)
Apr 19 05:40:57 <zodbot> RECOVERY - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is OK: RABBITMQ_QUEUE OK - messages OK (1) messages_ready OK (0) messages_unacknowledged OK (1) consumers OK (1) All queues under the thresholds (noc01)
Apr 19 13:42:58 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is WARNING: RABBITMQ_QUEUE WARNING - messages WARNING (282), messages_ready OK (257) messages_unacknowledged OK (25) consumers OK (1) (noc01)
Apr 19 13:52:58 <zodbot> PROBLEM - rabbitmq01.iad2.fedoraproject.org/Check queue fmn is CRITICAL: RABBITMQ_QUEUE CRITICAL - messages CRITICAL (1257), messages_ready OK (1232) messages_unacknowledged OK (25) consumers OK (1) (noc01)
So, we should figure out a way to not do this. ;)
We could just bump the check up to like 40000 or something?
Or you all could revisit just continuing to process while rebuilding the cache?
Or something. :)
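For what it's worth, here is a rough sketch of how one could pull the backlog numbers out of alerts like the ones above to see where a threshold such as 40000 would sit. The log lines are abridged copies of the overnight alerts (timestamp and counters only), and the regex and variable names are just illustrative, not anything that exists in our tooling.

```python
import re

# Abridged copies of the overnight alert lines quoted above: timestamp plus the
# counters, in the same layout the check prints them.
LOG = """\
Apr 19 00:40:56 RABBITMQ_QUEUE CRITICAL - messages CRITICAL (830), messages_ready OK (805)
Apr 19 01:00:57 RABBITMQ_QUEUE CRITICAL - messages CRITICAL (3052), messages_ready OK (3027)
Apr 19 02:00:57 RABBITMQ_QUEUE CRITICAL - messages CRITICAL (9903), messages_ready OK (9878)
Apr 19 03:00:57 RABBITMQ_QUEUE CRITICAL - messages CRITICAL (19733), messages_ready OK (19708)
Apr 19 04:00:57 RABBITMQ_QUEUE CRITICAL - messages CRITICAL (28472), messages_ready OK (28447)
Apr 19 05:00:57 RABBITMQ_QUEUE CRITICAL - messages CRITICAL (36369), messages_ready OK (36344)
Apr 19 05:40:57 RABBITMQ_QUEUE OK - messages OK (1) messages_ready OK (0)
"""

# Capture HH:MM, the state of the "messages" counter, and its value.
PATTERN = re.compile(
    r"^\w{3} \d+ (\d{2}:\d{2}):\d{2} .* - messages (\w+) \((\d+)\)", re.M
)

samples = [(t, state, int(n)) for t, state, n in PATTERN.findall(LOG)]
peak_time, _, peak = max(samples, key=lambda s: s[2])
print(f"peak backlog: {peak} messages at {peak_time}")
```

On these samples the peak is 36369 messages at 05:00, so a 40000 threshold would only just clear this particular rebuild.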
kevin
I was going to say that one thing you need to 'add' is announcing this plan of changes to devel and users mailing lists and the equivalent discourse at least 3 times.
Very true, thanks for the suggestion, I would not have communicated enough. Sadly, people don't like surprises.
I guess I would expect to just leave them all disabled and let the user go enable them again once they can log back in?
Noted, thanks.
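A minimal sketch of that behaviour, to make sure we mean the same thing. The `User` and `Rule` classes and the two handler names are hypothetical stand-ins for illustration, not FMN's actual models or message handlers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    disabled: bool = False


@dataclass
class User:
    name: str
    rules: list[Rule] = field(default_factory=list)


def on_user_disabled(user: User) -> None:
    """Disable every rule of a user who was disabled in IPA."""
    for rule in user.rules:
        rule.disabled = True


def on_user_enabled(user: User) -> None:
    """Deliberately a no-op: rules stay disabled until the user
    logs back in and re-enables the ones they want."""


# One rule was already disabled by the user before the account was disabled.
user = User("alice", [Rule("koji builds"), Rule("bodhi updates", disabled=True)])
on_user_disabled(user)
on_user_enabled(user)
print([(r.name, r.disabled) for r in user.rules])
```

Since nothing is re-enabled automatically, there is no need to remember which rules the user had disabled themselves.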
schedule a time to roll out the new and move the old so people know to expect it.
How about next week? Any preferred date besides Friday?
fmn has been alerting a lot. I am pretty sure from your description this is because when it's rebuilding its cache it doesn't process anything.
That's very likely indeed.
We could just bump the check up to like 40000 or something? Or you all could revisit just continuing to process while rebuilding the cache?
OK I bit the bullet and changed the way this situation is handled. FMN will now continue processing messages and rebuild the cache in the background. As a result, the changes will be effective when the cache rebuild is done. I think that's indeed the best way forward, and I should have done that initially. Anyway. The change is running in production already, and it seems to work as expected. The queue should not pile up anymore. I'm also now storing in the cache DB the time it took to rebuild the main caches, so we can have an estimation of what's going on. It's not very accessible at the moment but the data is there.
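For the curious, the general idea can be sketched as below. This is a simplified illustration and not the actual FMN code: the `RuleCache` class, its `loader` callable and the `last_rebuild_seconds` attribute are hypothetical names, and a plain thread stands in for whatever the real service uses.

```python
import threading
import time


class RuleCache:
    """Hypothetical cache that stays readable while it is being rebuilt."""

    def __init__(self, loader):
        self._loader = loader          # callable returning fresh cache data
        self._data = loader()          # the initial build is synchronous
        self._lock = threading.Lock()  # guards the swap and the flag below
        self._rebuilding = False
        self.last_rebuild_seconds = None

    def get(self, key, default=None):
        # Readers always see a complete snapshot, either the old or the new one.
        return self._data.get(key, default)

    def rebuild_in_background(self):
        """Start a rebuild unless one is already running."""
        with self._lock:
            if self._rebuilding:
                return None
            self._rebuilding = True
        thread = threading.Thread(target=self._rebuild, daemon=True)
        thread.start()
        return thread

    def _rebuild(self):
        start = time.monotonic()
        try:
            fresh = self._loader()     # slow part, done outside the lock
            with self._lock:
                self._data = fresh     # swap in the new snapshot
                self.last_rebuild_seconds = time.monotonic() - start
        finally:
            with self._lock:
                self._rebuilding = False


source = {"alice": ["koji"]}
cache = RuleCache(lambda: dict(source))
source["bob"] = ["bodhi"]              # a change that requires a rebuild
worker = cache.rebuild_in_background()
print(cache.get("alice"))              # served without waiting for the rebuild
worker.join()
print(cache.get("bob"))                # visible once the rebuild has finished
```

Message processing keeps calling `get()` against the old snapshot during the rebuild, which is why a change only becomes effective once the rebuild is done.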
Aurélien