DHCPv6 still broken for F17 alpha

Fri Mar 16 18:57:09 UTC 2012

Hi Dan,

* Dan Williams

> On Fri, 2012-03-02 at 14:52 +0100, Tore Anderson wrote:
>
>> That is true, however, if IPv6 completes first, and IPv4 (still running
>> in the background) eventually ends up failing, the *entire connection*
>> will be torn down - including the perfectly working IPv6 connectivity.
>> So the successfully connected state only lasts for about 20 seconds.
> 
> I've gone back and forth on this last week; since it changes the
> default, it would break the case where somebody depends on the current
> behavior, ie that by default IPv4 may not fail.  After this patch is
> applied, a network where IPv6 connectivity is available but broken (or
> where the router sends RAs with private prefixes like fdxx::) and IPv4
> is for some reason also broken, will make NM show "connected" when in
> fact we aren't really.  The new connectivity detection will help that
> somewhat, but we haven't enabled it by default yet for a few reasons.
> 
> I ran into a network when testing this that caused me to think harder
> about this patch.  It's an Actiontec router attached to Comcast (I
> think) but has no upstream IPv6 connectivity.  It sends RAs for the
> fdxx:: address space and NM dutifully picks that up.  So now we've got
> IPv6 connectivity to a "private" prefix that's not routable.  If, in
> this case, the router's DHCP server died, which sometimes happens on
> crappy consumer hardware, an upgraded NM would report connected while
> old NMs would fail the connection.
> 
> Whether we care enough about this regression (if you want to call it
> that) versus enabling default IPv6 connectivity I don't know, I tend to
> think we suck up the regression.  But I'm still interested in the
> failure cases.

So what you have here is a double failure, of sorts. First, your DHCPv4
service is broken, and second, your IPv6 service is broken too, but in a
way that doesn't stop RAs. You'd like the connection activation to fail
in this case. But what do you really accomplish? The systray icon will
say "not connected", which may be somewhat useful, but on the other
hand, by allowing it to say "connected" instead, the user is really not
any worse off - his browser (or whatever application) will give him error
messages in both cases, and he won't get to do what he wants to do.

Besides, conceptually, this error isn't any different from another one
that doesn't involve IPv6 at all. Let's say that the DHCPv4 server in
your home gateway router works beautifully, but that its upstream
DOCSIS/DSL/fibre/whatever link doesn't. (Having myself been on DSL for a
number of years, I'll be damned if this is not something that happens
*way* more frequently than the scenario you outlined above.) If you want
to be consistent in not activating the network connection when internet
connectivity doesn't work, you'd have to make sure the connection fails
in this case too, right? But you don't, you allow the connection to
succeed without the internet connectivity working. Which leaves the user
in the exact same place as the guy behind the Actiontec. It's been like
this for, like, forever. And somehow, the sky hasn't fallen yet. :-)

However, if you turn it around, the guy behind the Actiontec with the
defective DHCPv4 server might actually have working IPv6 connectivity,
or at least he will have soon - Comcast is one of the leading ISPs in
the world when it comes to IPv6 deployment. Do you really want to leave
him without *any* connectivity in this case, or is it better to leave
IPv4 failed but IPv6 working? Remember, with the entire connection
failed, he can't get anywhere at all. With IPv6 still working, he'll be
able to get to Google and try to find out what is going on, he'll
probably be able to get to Comcast's customer portal to request
assistance, or simply to hit Facebook to kill some time. That has got to
be much better than having no connectivity at all, agreed?

And, finally, it is indisputable that IPv6-only networks are a perfectly
valid configuration. Requiring IPv4 breaks those, too.

The way I see it, what you gain by not allowing IPv4 to fail is
providing the user with a clearer error message in the case of a very
narrow failure scenario. You don't actually fix or work around the
problem in any way. Furthermore, you break other valid configurations,
and aggravate other narrow failure scenarios. It's clearly not worth it,
in my opinion.
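
To make the trade-off concrete, here is a rough model of the activation
decision. It is only a sketch: the function and its parameter names are
hypothetical and this is not NetworkManager's actual code. It just
assumes that each address family either succeeds or fails, and that a
family marked "may fail" is allowed to fail as long as the other one
comes up.

```python
def connection_activates(ipv4_ok, ipv6_ok, ipv4_may_fail, ipv6_may_fail=True):
    """Hypothetical model of the activation policy discussed above.

    ipv4_ok / ipv6_ok: whether configuration of that family succeeded.
    *_may_fail: whether that family is allowed to fail without taking
    the whole connection down.
    """
    # A family that is required (may_fail is False) and failed is fatal.
    if not ipv4_ok and not ipv4_may_fail:
        return False
    if not ipv6_ok and not ipv6_may_fail:
        return False
    # Otherwise at least one family must actually have come up.
    return ipv4_ok or ipv6_ok


# Old default: IPv4 required, so working IPv6 is torn down with it.
print(connection_activates(False, True, ipv4_may_fail=False))
# New default: IPv4 may fail, so the IPv6-only case stays connected.
print(connection_activates(False, True, ipv4_may_fail=True))
```

With the old default the IPv6-only network and the broken-DHCPv4 case
both fail; with the new default both stay up, and only the case where
neither family works fails.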

I know my patch is already in NM git, so I just wanted to send you this
message mostly so you can sleep easy at night - convince you it was the
right thing to do. :-) Thank you again!

> Next up, since AFAIK fdxx:: is a non-routable private network (like 10/8
> right?) should NM say that we're only connected to a site-local network
> here?  That would at least help the situation above, and indicate that
> something went wrong instead of NM saying we're connected to the
> internet and nothing working.

Yes and no. On their own, Unique Local Addresses (ULAs, fc00::/7, of
which fd00::/8 is the part in actual use) need NAT66, proxies, or
something along those lines to get the user on
the internet. On the other hand, ULAs might also be used in parallel to
public addresses handed out by ISPs. This scenario is described in RFC
6204. The idea is that while the global addresses from the ISP may
change from time to time, just like your global IPv4 address does, the
ULAs will remain stable. That allows your manually-configured network
printer to keep working after you change ISPs, for example. So if you
want to look at ULAs to determine if the connectivity is site-local or
not, you'll also have to ascertain whether there are any globally
routable addresses present.
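
As a rough illustration of that check, something along these lines
would do. This is a sketch using Python's standard ipaddress module;
the function name and the returned labels are made up for the example,
and the ULA range is taken as fc00::/7 per RFC 4193.

```python
import ipaddress

# Unique Local Address range as defined in RFC 4193.
ULA_RANGE = ipaddress.ip_network("fc00::/7")


def classify_ipv6(addresses):
    """Hypothetical helper: classify the IPv6 addresses on an interface.

    Returns "global" if any globally routable address is present,
    "site-local" if only ULAs (plus link-locals) are present,
    and "none" otherwise.
    """
    parsed = [ipaddress.ip_address(a) for a in addresses]
    # Ignore IPv4 and link-local addresses; they say nothing here.
    v6 = [a for a in parsed if a.version == 6 and not a.is_link_local]

    # A ULA next to a global address is the stable-addressing setup,
    # so the presence of a global address wins.
    if any(a.is_global for a in v6):
        return "global"
    if any(a in ULA_RANGE for a in v6):
        return "site-local"
    return "none"


print(classify_ipv6(["fe80::1", "fd12:3456:789a:1::10"]))
print(classify_ipv6(["fd12:3456:789a:1::10", "2001:4860:4860::8888"]))
```

Note that a global address being configured still doesn't prove the
upstream link works; it only rules out the ULA-only case.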

Some sort of a connectivity check might in any case be useful, though.
MS Windows does this, only passively. It doesn't say you're connected to
the internet until you have actually received some traffic from there
(e.g. by visiting a web site).
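
The passive idea could look roughly like this. It is a sketch only: the
class and method names are invented for the example, it says nothing
about how Windows or NM actually implement it, and it assumes something
else feeds it the source address of each inbound packet.

```python
import ipaddress


class PassiveConnectivity:
    """Hypothetical tracker: report "internet" only after traffic
    from a globally routable source has actually been seen."""

    def __init__(self):
        self.internet_seen = False

    def observe_inbound(self, source):
        # Traffic from private, ULA or link-local sources proves
        # nothing about upstream connectivity, so it is ignored.
        if ipaddress.ip_address(source).is_global:
            self.internet_seen = True

    @property
    def state(self):
        return "internet" if self.internet_seen else "local only"


tracker = PassiveConnectivity()
tracker.observe_inbound("fd12:3456:789a::1")   # the local router
print(tracker.state)
tracker.observe_inbound("2001:4860:4860::8888")  # a host on the internet
print(tracker.state)
```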

Best regards,
-- 
Tore Anderson
