Hi,
I dumped my brain here:
https://github.com/cockpit-project/cockpit/wiki/Bootstrapping-a-network-of-C...
I very much need to talk to Andreas and Stef about this, obviously, since they have done a lot of thinking already.
Nevertheless, I'll make some "issues" for the work items and start working on this.
Marius Vollmer marius.vollmer@redhat.com writes:
I very much need to talk to Andreas and Stef about this, obviously, since they have done a lot of thinking already.
Ok, after discussion with Stef, I have pretty much rewritten the whole thing.
Major changes:
- No difference between "management server" and "managed system" anymore.
- OpenSLP is a requirement; it's the only way to configure what's on the dashboard.
- Initial Setup is not addressed anymore.
https://github.com/cockpit-project/cockpit/wiki/Bootstrapping-a-network-of-C...
Marius Vollmer marius.vollmer@redhat.com writes:
- OpenSLP is a requirement; it's the only way to configure what's on the dashboard.
Sorry, I have to whine a bit. OpenSLP doesn't seem to be reliable enough, due to no fault of its own.
In a straightforward libvirt virtual network with NAT, replies to multicast or broadcast packets get lost since they arrive with a NATed but non-working source address.
Also, NetworkManager sets the broadcast address in my virtual machines to 0.0.0.0 although it shows the right one in the DHCP config on D-Bus. Ifup does it right.
Also also, one needs to explicitly open the firewall for replies to multicast or broadcast packets, since they are not considered to be part of an established connection (I think).
Also also also, OpenSLP says to use net.slp.isBroadcastOnly but that's for slpd; slptool needs net.slp.useBroadcast, which doesn't seem to be documented.
Also also also also, OpenSLP uses CFLAGS to pass in -DLINUX, which I clobbered when compiling with CFLAGS=-g, and that magically made broadcast work because it now used a hard-coded address of 255.255.255.255 instead of the bogus 0.0.0.0 reported by the interface.
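To make the broadcast-address problem concrete, here is a minimal Python sketch of a fallback: if the interface reports the bogus 0.0.0.0, derive the directed broadcast address from the interface's address and prefix, and otherwise fall back to 255.255.255.255. The helper name and the example addresses are made up for illustration; this is not what OpenSLP does internally.

```python
import ipaddress

# Limited broadcast address, the same value the accidental non-LINUX build used.
LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def effective_broadcast(reported, interface_cidr=None):
    """Return a usable broadcast address (hypothetical helper).

    reported       -- broadcast address as reported by the interface
    interface_cidr -- optional "address/prefix" of the interface,
                      e.g. "192.168.122.10/24" (an assumed example)
    """
    reported = ipaddress.IPv4Address(reported)
    if not reported.is_unspecified:
        # The interface reported something other than 0.0.0.0; trust it.
        return reported
    if interface_cidr is not None:
        # Derive the directed broadcast address from address and prefix.
        return ipaddress.ip_interface(interface_cidr).network.broadcast_address
    # Nothing better is known; use the limited broadcast address.
    return LIMITED_BROADCAST
```

For example, effective_broadcast("0.0.0.0", "192.168.122.10/24") gives 192.168.122.255, and effective_broadcast("0.0.0.0") gives 255.255.255.255.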
I am afraid we just can't rely on multicast or broadcast working. I hope that we can at least rely on using OpenSLP with unicast (to retrieve information about a machine once the user has typed in its address), but I am not sure whether that will work without turning every machine into an SLP "Directory Agent", which is probably not a good idea.
So we might need to invent our own advertisement mechanism, and only use OpenSLP to discover IP addresses.
We still need to find a way to test this, though, and the most straightforward thing would be to disable NAT during a test case run. This also means that no test case can access the Internet, which it shouldn't do anyway, so that might be OK.
(I hate software. :-)
Marius Vollmer marius.vollmer@redhat.com writes:
Sorry, I have to whine a bit.
One more: slpd is started too early, doesn't find a multicast route, tries to create one, fails, and refuses to start. Or something like that. Starting it later when the network is up works perfectly.
I also think I found a workaround for the multicast/NAT bug...
On 11/29/2013 04:18 AM, Marius Vollmer wrote:
Marius Vollmer marius.vollmer@redhat.com writes:
Sorry, I have to whine a bit.
One more: slpd is started too early, doesn't find a multicast route, tries to create one, fails, and refuses to start. Or something like that. Starting it later when the network is up works perfectly.
I also think I found a workaround for the multicast/NAT bug...
This, at least, we can correct by updating the unit file to be After=network.
On 02.12.2013 13:54, Stephen Gallagher wrote:
On 11/29/2013 04:18 AM, Marius Vollmer wrote:
Marius Vollmer marius.vollmer@redhat.com writes:
Sorry, I have to whine a bit.
One more: slpd is started too early, doesn't find a multicast route, tries to create one, fails, and refuses to start. Or something like that. Starting it later when the network is up works perfectly.
I also think I found a workaround for the multicast/NAT bug...
This, at least, we can correct by updating the unit file to be After=network.
Are you sure about that?
http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
Long running daemons/services should react to changes in networking, not wait for the 'network' to come up. As the above link explains, that's a completely undefined thing to wait for.
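As a rough illustration of "don't depend on startup ordering", here is a minimal Python sketch of a daemon that keeps retrying the multicast join instead of giving up when the network isn't there yet. This is plain retry with backoff, which is cruder than actually reacting to network change events, and it is not how slpd is implemented. The helper names are hypothetical; 239.255.255.253 and port 427 are the standard SLP multicast group and port.

```python
import socket
import struct
import time

SLP_MCAST_GROUP = "239.255.255.253"  # standard SLP multicast group
SLP_PORT = 427                       # standard SLP port


def join_slp_group(sock):
    """Join the SLP multicast group on the default interface.

    Raises OSError if the join fails, e.g. because no suitable
    interface or route exists yet.
    """
    mreq = struct.pack(
        "4s4s",
        socket.inet_aton(SLP_MCAST_GROUP),
        socket.inet_aton("0.0.0.0"),  # let the kernel pick the interface
    )
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)


def retry_until_ready(action, delay=1.0, max_delay=30.0, attempts=None,
                      sleep=time.sleep):
    """Call action() until it stops raising OSError (hypothetical helper).

    The wait doubles after each failure, capped at max_delay.
    attempts=None retries forever; otherwise the last OSError is
    re-raised after that many failed tries.
    """
    tried = 0
    while True:
        try:
            return action()
        except OSError:
            tried += 1
            if attempts is not None and tried >= attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

A daemon would then create its UDP socket once and call retry_until_ready(lambda: join_slp_group(sock)) rather than exiting on the first failure.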
Cheers,
Stef
cockpit-devel@lists.fedorahosted.org