F23 System Wide Change: Default Local DNS Resolver

Thu Jun 18 16:14:49 UTC 2015

On Fri, 2015-06-12 at 14:32 -0400, Paul Wouters wrote:
> On Fri, 12 Jun 2015, Dan Williams wrote:
> 
> >> That is why HTTP redirection and DNS failure have to be detected by
> >> whatever is the "hot spot detector". Both items weigh in on triggering
> >> a hotspot logon window.
> >
> > Agreed.  But how does the DNS failure actually get relayed to the thing
> > doing the HTTP request, when unbound + DNSSEC is involved?  That's one
> > point I'm very unclear on.
> 
> In hotspot mode (dnssec-trigger's version of hotspot mode)
> /etc/resolv.conf contains the DHCP supplied DNS servers. Those are used
> to determine the "DNS cleanliness" state, and are also used to fetch
> the fedoraproject hot spot detection page. The unbound DNS server, while
> running, is not used at all for anything, as resolv.conf does not point
> to it. Unfortunately, because this is not isolated to dnssec-triggerd,
> all applications doing DNS during this time get crap/dangerous DNS
> resolves, leading to all the bad certificate warning popups. That is
> why I was hoping to isolate that with either a network namespace, or
> some other solution that saves us from having to affect the whole
> system by changing resolv.conf.
> 
> If selecting "cache only", then resolv.conf points to 127.0.0.1 and
> unbound is configured with a "DNS forwarder" for everything set to
> 127.0.0.127 so no DNS lookups ever leave the host.
> 
> >>> 1. NM connects to a new network
> >>> 2. NM updates DNS information
> >>
> >> I don't know what 2) means. If it means rewriting /etc/resolv.conf or
> >> the unbound forwarder configuration, we have already lost if the DNS
> >> was malicious (and/or a hotspot DNS)
> >
> > It means whatever "dns" action was set in NM, either writing
> > resolv.conf, not touching anything (dns=none), sending split DNS to
> > unbound (dns=unbound), or to dnsmasq (dns=dnsmasq), etc.  In this case
> > I'll presume dns=unbound.
> 
> Ahh thanks.
> 
> >> dnssec-trigger currently detects the difference by also checking for an
> >> http hotspot redirect using http://fedoraproject.org/static/hotspot.txt
> >> If no http redirect, then DNS is broken and it tries to work around it
> >> by becoming a full iterative resolver or doing DNS over TCP or DNS over
> >> TLS, and if it all fails, presents the "insecure or cache only" dialog.
> >
> > NM also checks for redirection.
> >
> > Though, what do you mean by "if no HTTP redirect, then DNS is broken"?
> 
> Sorry I meant "If no http redirect, and DNS is broken, then it tries to
> work around by ...". That is, when there is an http redirect, there is
> no point doing anything about DNS because after authenticating to the
> hotspot, DNS might turn out to be either fine or broken for other
> reasons.
> 
> >> 1) NM detects a new network, but doesn't tell the applications that there
> >>     is network connectivity yet. So firefox won't throw HTTPS warnings
> >>     and pidgin/IM won't throw https warnings. Because as far as they know
> >>     the network is still down.
> >
> > Agreed.  Right now we have "connectivity" states, but they are all
> > determined after the interface is signaled as "connected".  We can do
> > some work here to indicate connectivity status on this interface before
> > indicating to applications that the interface is fully connected.
> 
> That would be awesome!
> 
> >> 2) NM/dnssec-trigger does the HTTP and DNS probing and prompting using
> >>     a dedicated container and any DNS requests in that container are
> >>     thrown away with the container once hotspot has been authenticated.
> >>     This would allow us to never have resolv.conf on the host be
> >>     different from 127.0.0.1. (currently, it needs to put in the hotspot
> >>     DNS servers for the hotspot logon, exposing other applications to
> >>     fake DNS)
> >
> > I'm not sure a container really needs to be involved as long as the DNS
> > resolution can be done without hitting resolv.conf.  That's not hugely
> > hard to do I think
> 
> True. In fact with unbound it is pretty trivial to do. The equivalent
> unbound python code for that would be:
> 
> import unbound
> 
> ctx = unbound.ub_ctx()
> ctx.resolvconf("/this/networks/respresentation/of/resolv.conf")

Hmm, that doesn't really allow for split DNS though since it uses the
resolv.conf format?  Ideally we could just send unbound a list of server
+domain, and then a fallback of server+"*" for anything not matching
that list.
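
To make that concrete, here is a rough sketch of the kind of list I
mean, rendered as unbound.conf forward-zone clauses, where the "*"
fallback becomes a forward-zone for the root ".".  The function name,
the example domain and the addresses are all made up for illustration,
and how the result would actually get handed to unbound is left open:

```python
def render_forward_zones(split_dns, fallback_servers):
    """Render unbound.conf forward-zone clauses as a string.

    split_dns: dict mapping a domain to the list of servers for it.
    fallback_servers: servers for everything else (the "*" case),
    expressed as a forward-zone for the root ".".
    """
    zones = sorted(split_dns.items())
    # The catch-all goes last so the output reads most-specific first.
    zones.append((".", fallback_servers))

    lines = []
    for domain, servers in zones:
        lines.append("forward-zone:")
        lines.append('    name: "%s"' % domain)
        for server in servers:
            lines.append("    forward-addr: %s" % server)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical VPN domain plus DHCP-supplied fallback servers.
    print(render_forward_zones({"corp.example.com": ["10.0.0.53"]},
                               ["192.0.2.1", "192.0.2.2"]))
```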

> any resolve calls made will use the non-system resolv.conf's nameserver
> addresses.
> 
> So the hotspot check could be:
> 
> ctx = unbound.ub_ctx()
> ctx.add_ta_file(rootanchor) # DNSSEC root key
> ctx.resolvconf("/this/networks/respresentation/of/resolv.conf")
> status, result = ctx.resolve("fedoraproject.org", unbound.RR_TYPE_A)
> if not result.havedata or not result.secure:
>  	# we're captive because fedoraproject.org is DNSSEC signed and
>  	# we got an error (forged) response
>  	# Redo query with a non-DNSSEC cache to get forged A record to
>  	# authenticate to the hotspot
>  	insecurectx = unbound.ub_ctx()
>  	insecurectx.resolvconf("/this/networks/respresentation/of/resolv.conf")
>  	status, result = insecurectx.resolve("fedoraproject.org", unbound.RR_TYPE_A)
>  	if result.havedata:
>  		addr = result.data.address_list[0]
>  		# give addr to the captive portal logon HTTP engine
>  	insecurectx.ub_close()
> else:
>  	if result.havedata:
>  		# check for HTTP interception - we might still be captive
>  		addr = result.data.address_list[0]
>  		# give addr to the captive portal logon HTTP engine
> ctx.ub_close()

Ok, so as I asked about earlier, the connectivity checking is a two-step
process then:

1) look up the IP address of your connectivity server using DNS, and if
the result is insecure then you know your DNS is hijacked (eg,
hotspot/portal) or your local DNS server is utterly broken

2) if #1 succeeded then continue to the HTTP connectivity check

> Things are a little trickier because the hotspot likely stupidly uses
> even more DNS calls to build up the logon page, so whatever the http
> rendering agent is (eg xdg-open or firefox or whatever) needs to keep
> using this unbound cache and not fall back to the system default one.

Yeah, whatever thing handles the logon (GNOME Shell, etc) would have to
have a method to inject nameservers that the web sub-process would use
exclusively instead of the system nameservers.  Then the process would go
something like this:

0) resolv.conf/unbound/whatever DNS is unpopulated
1) NetworkManager looks up connectivity server IP address
2) NetworkManager does the connectivity check using that IP address
3) assuming either #1 or #2 fails, NM signals HOTSPOT connectivity state
and provides the DNS information via the API, including any DHCP/WISPR
information received from the network
4) Hotspot agent (GNOME Shell, KDE, whatever) sees the HOTSPOT state,
reads the DNS servers, and spawns a sandboxed web browser using the
preliminary DNS servers
5) User completes hotspot logon or rejection, user agent signals that
hotspot operations are complete
6) NetworkManager re-does connectivity checks, and assuming the result
is "success", indicates CONNECTED connectivity state

or something like that.  But these steps could be split out into a
small, single-purpose connectivityd that used information from NM/other
sources and was triggered by NM/other sources, and then NM wouldn't have
to do it.  NM could still proxy the state since NM would know when to
trigger connectivity checks anyway.  But I digress.
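
For what it's worth, the steps above could be sketched roughly like
this.  It is only an illustration of the flow, not NM code: the
function name, the callables passed in, the retry limit and the
"UNKNOWN"/"FAILED" state names are placeholders I made up; only
HOTSPOT and CONNECTED come from the list above.

```python
def bring_up(check, hotspot_agent, dns_servers, max_logins=3):
    """Walk through the connectivity states for one new connection.

    check():                 True if both the DNS lookup and the HTTP
                             connectivity check succeed (steps 1-2).
    hotspot_agent(servers):  runs the sandboxed logon using only the
                             preliminary DNS servers; True if the user
                             completed the logon, False if rejected.
    Returns the list of states that were signalled, in order.
    """
    states = ["UNKNOWN"]          # step 0: system DNS not populated yet
    logins = 0
    while True:
        if check():               # steps 1-2, and again in step 6
            states.append("CONNECTED")
            return states
        states.append("HOTSPOT")  # step 3
        # steps 4-5: hand the preliminary DNS servers to the agent
        if logins >= max_logins or not hotspot_agent(dns_servers):
            states.append("FAILED")
            return states
        logins += 1


if __name__ == "__main__":
    results = iter([False, True])     # captive first, fine after logon
    print(bring_up(lambda: next(results),
                   lambda servers: True,
                   ["192.0.2.1"]))
```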

> > Then once the hotspot login is completed, we must re-do the connectivity
> > check to ensure that we do indeed have access to the full internet.  If
> > we do, then we can finally signal "connected".  If it fails again, then
> > we either show the hotspot login window again, or somehow indicate that
> > hotspot login failed.
> 
> > Note that none of this mentions DNS to the user at all yet...  so what
> > happens if the hotspot login succeeds, we get connectivity to the
> > internet, but the hotspot DNS doesn't support DNSSEC correctly?
> 
> If HTTP is no longer redirected (dnssec-trigger keeps probing while you
> pull your credit card out), it assumes you have successfully authenticated
> to the hotspot. It re-tests the supplied DNS servers. If these are still
> determined to be too broken for using DNSSEC (eg too old bind,
> dnsmasq) it tries to (silently) become a full iterative nameserver,
> eg it will not use any forwards and do all the DNS work itself. If this
> also fails, for example because the network blocks port 53 to all but
> its own DNS servers,  dnssec-trigger tries the other modes of DNS over
> TCP/SSL. If any of this works the user isn't even consulted. Only when
> all of this fails do we need to contact the user and ask them to go
> "insecure" or "cache only".

This is the part that I feel like unbound should do, or if not unbound
then whatever local caching nameserver we do have.  Perhaps that's
already built into unbound and only controlled by dnssec-trigger, but it
seems like something more integral to the resolver than outside of it?
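
Whichever component ends up owning it, the fallback order you describe
boils down to something like the sketch below.  The mode names and the
probe callables are placeholders of mine, not dnssec-trigger or unbound
identifiers; each probe is assumed to return True when DNSSEC
validation works in that mode.

```python
# Order taken from the description above: DHCP-supplied forwarders
# first, then full iterative resolution, then DNS over TCP, then TLS.
FALLBACK_ORDER = (
    "dhcp-forwarders",
    "full-iterative",
    "dns-over-tcp",
    "dns-over-tls",
)


def pick_dns_mode(probes, order=FALLBACK_ORDER):
    """Return the first mode whose probe succeeds, else "ask-user".

    probes: dict mapping a mode name to a zero-argument callable.
    The user is only consulted ("insecure" or "cache only") when
    every mode has failed.
    """
    for mode in order:
        if probes[mode]():
            return mode
    return "ask-user"


if __name__ == "__main__":
    # Example: port 53 blocked for UDP, but DNS over TCP gets through.
    probes = {
        "dhcp-forwarders": lambda: False,
        "full-iterative": lambda: False,
        "dns-over-tcp": lambda: True,
        "dns-over-tls": lambda: True,
    }
    print(pick_dns_mode(probes))
```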

Dan