On 7/6/22 04:37, Lennart Poettering wrote:
On Di, 05.07.22 22:35, Dusty Mabe (dusty(a)dustymabe.com) wrote:
> On 6/25/22 15:06, Vipul Siddharth wrote:
>> This document represents a proposed Change. As part of the Changes
>> process, proposals are publicly announced in order to receive
>> community feedback. This proposal will only be implemented if approved
>> by the Fedora Engineering Steering Committee.
>>
>>
>> == Summary ==
>>
>> The `systemd-udev` package installs
>> `"/usr/lib/systemd/network/99-default.link"`,
>> which sets `Link.MACAddressPolicy=persistent`. This proposal is to
>> change it to set `Link.MACAddressPolicy=none` to stop changing the MAC address.
>> This is particularly important for bridge and bond devices.
>>
>> This change can either only apply to bridge/bond devices, or to
>> various software devices. That is to be discussed.
>
> Based on the feedback here on the list we have modified the scope of this proposal
> to now be limited to changing the MACAddressPolicy=none for bond/bridge/team devices
> only.
>
> New summary:
>
> ```
> The systemd-udev package installs
> "/usr/lib/systemd/network/99-default.link", which sets
> Link.MACAddressPolicy=persistent for all software NIC devices. This proposal is to add to
> the policy so that we use Link.MACAddressPolicy=none for bond/bridge/team devices.
> ```
>
> https://fedoraproject.org/wiki/Changes/MAC_Address_Policy_none

So, with changing this back you'll also break all those setups where
it is assumed the bridge MAC just works and is stable and independent
from the devices added to it and the order in which they are added.

One thing we do need to think about here is the "upgrade" case. We could
consider leaving existing (upgrading) systems alone.

What makes you so sure that changing this *again* is so much better
than just leaving it the way it is now? This change isn't precisely
new, and changing stuff back and forth comes at quite a price.

Indeed it does come at a price, which is why we're discussing the merits
here and also why we scaled back the proposal to just cover the use case
where it matters the most. For example, this kernel documentation is a
place where a user is now left confused because the kernel behaves one way for
bonds, but systemd delivers configuration to make it behave differently:

https://wiki.linuxfoundation.org/networking/bonding#where_does_a_bonding_...

If you look at the longer term landscape (i.e. years from now) the change
is worth it if it's the right change to make. I think the fact that RHEL9
didn't pick up the original change is an indication of its value in enterprise
environments where bond/bridge/team devices are more common.

Generally, if we change behaviour like this, then the pros must
seriously outweigh the cons. Now you might argue that when the
original change was made that wasn't the case, that's a valid opinion,
though one I disagree with. But that has no effect on today, on the
question whether changing it again is worth it. I am pretty sure that
there are *also* numerous scripts that benefit from the predictability
and stability you get with the status quo, and which you'll break if
you revert to the old state, again.

There is a lot that happens in upstream systemd that we should more carefully
consider in Fedora. For example, this probably should have been proposed to
Fedora as a change back then (by the systemd maintainers in Fedora, I suspect) and the
merits examined at that point in time.

I am not convinced we should flip flop on this all the time. Yes, it's
unfortunate that people tripped over this, but you are not really
making it better if you then break it *again*, given the benefit is
unclear, and the major software creating bridges/bonds/… doesn't care
anyway (i.e. NM, networkd, …).

This definitely affects NetworkManager and people do care. See:

https://github.com/coreos/fedora-coreos-tracker/issues/919
https://github.com/systemd/systemd/issues/15208

I for one do not see where you'd get the crystal ball to look into to
know that by changing this *again* you'll make bazillions of people
happy, and only very few people sad, because you break their stuff. It
might very well be that you'll make more people sad because you break
their stuff again, than were happy before.

The discussion here is one way to gather feedback. There is also the collective
expertise of the members of FESCO, who make decisions like this all the time.

Also, let's not forget that allowing users to set the policy is a good
thing, we should let them. Given that the original change was already
made, a lot of software that cannot work with such changes has already
been updated to override the default policy. That's a good thing,
since the overall system becomes more robust and people can more
safely change the policies locally, with less breakage to expect.

If you now revert to the status quo ante, then you basically also say:
fuck it, we don't care that software is fixed to work with changed
policies, let's keep things so brittle that you effectively cannot change
policies anymore, because we are sticking our head in the sand and
don't care that they are fixed.

I don't think that accurately reflects what we are saying here. I'm not
sure why making a change here makes things more brittle. Maybe it depends
on the implementation?

So, I am not convinced.

I can accept that this wasn't handled particularly nicely
originally. My proposal for addressing this is via documentation, i.e.
update the udev and iproute docs to explicitly point to the issue and
how to deal with it, with an example drop-in to make it easy to deal
with it.
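
For reference, a rough sketch of what such a drop-in might look like. This is an illustration only, not a file shipped by systemd or Fedora: the file name is hypothetical, and it assumes the installed systemd supports `Kind=` in the `[Match]` section of `.link` files. Only the first matching `.link` file is applied to a device, so the naming settings below would need to be copied from the locally installed `99-default.link` rather than taken from this sketch.

```ini
# /etc/systemd/network/98-bond-bridge-mac-none.link  (hypothetical file name)
# Sorts before 99-default.link, so it is the file applied to matching devices.

[Match]
# Assumption: Kind= matching is available for .link files on this systemd version.
Kind=bond bridge team

[Link]
# Repeat the naming policy from the installed 99-default.link; the values
# here are illustrative and may differ on a given release.
NamePolicy=keep kernel database onboard slot path
AlternativeNamesPolicy=database onboard slot path
# Leave the MAC address as assigned by the kernel.
MACAddressPolicy=none
```

A file like this affects devices created after it is installed; an already existing bond or bridge keeps its current MAC address until it is recreated.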