default local DNS caching name server

William Brown william at firstyear.id.au
Mon Apr 14 05:49:41 UTC 2014


On Mon, 2014-04-14 at 00:42 -0400, Paul Wouters wrote:
> On Mon, 14 Apr 2014, William Brown wrote:
> 
> > What is a "captivity-sign" as you so put it?
> 
> Check for clean port 80. It fetches the URL specified in
> dnssec-triggerd.conf's url: option
> (default http://fedoraproject.org/static/hotspot.txt)
> 
> If it returns a redirect or a page that does not contain the exact text
> "OK", it knows a hotspot has intercepted the page and will prompt the
> user to log in to the hotspot. If the user agrees, resolv.conf is filled
> in with the DHCP-obtained values and it fires off xdg-open to the page
> http://hotspot-nocache.fedoraproject.org/ which is a special DNS entry
> with TTL=0 so it can never be cached (so we will go through the DNS lies
> that are told about the name)
> 
> When port 80 becomes clean, it is assumed you have "logged on"
> and it then runs various DNS/DNSSEC tests against TLD servers for known
> features and bugs in old DNS software. This will determine if DNS is
> still being messed with. If the forwarder shows broken behaviour, an
> attempt is made to bypass it as I described before.
> 

This seems like a sane(ish) method of doing this. What happens if the
hotspot page is down? Why not use a mirror-like setup, as yum does,
where you try 2 or 3 mirrors and only declare a captive portal if they
all fail?
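
For anyone following along, the check described above amounts to roughly
this (a sketch; the URL is dnssec-triggerd's default, and the daemon does
this internally rather than shelling out to curl):

# curl does not follow redirects by default, so a redirecting hotspot
# also fails the "OK" comparison.
page=$(curl -s http://fedoraproject.org/static/hotspot.txt)
if [ "$page" != "OK" ]; then
    echo "captive portal suspected - prompt the user to log in"
fi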


> >> sudo unbound-control forward_add starfish 10.1.2.3
> >> sudo unbound-control flush starfish
> >> sudo unbound-control flush_requestlist
> >>
> >> When you leave the network, forward_remove is called.
> >>
> >> sudo unbound-control forward_remove starfish
> >> sudo unbound-control flush starfish
> >> sudo unbound-control flush_requestlist
> >
> > Okay, so let's expand this to my workplace, which runs a University
> > network. We have thousands of students connected. Now, we have many
> > zones on our network: services.university.edu, university.edu,
> > medicalcenter.org, research.com, etc.
> >
> > We can't possibly put all of these into our "domain-name" dhcp option.
> > IIRC it's a single-valued attribute anyway.
> 
> As we indicated, for "trusted" networks (LAN, secure WIFI) a domain of
> "." will be used which means "forward everything". This does NOT mean we
> stop being a recursor. We still recursive because we need tod perform
> DNSSEC validation. We just use the available DNS cache of the local
> network - which also gets us your internal-only domains.

This point was not made clear; I will summarise at the bottom.
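
For reference, as I now understand it, "forward everything but keep
validating" corresponds to something like this in unbound.conf terms (a
sketch; the address stands in for whatever DHCP provides, and
dnssec-trigger would set this at runtime via unbound-control rather than
in a config file):

server:
    auto-trust-anchor-file: "/var/lib/unbound/root.key"  # keep validating
forward-zone:
    name: "."                   # forward everything ...
    forward-addr: 192.0.2.53    # ... to the DHCP-provided resolver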


> 
> Yes. The problem here is the dodgy ISP. If they are dodgy enough,
> unbound will bypass them anyway. If we need to add an NM option for
> "don't use this dodgy ISPs DNS servers" we can also add that.
> 
> > But you can't really tell what's a dodgy DNS and what's not.
> 
> Yes we can. There is both dnssec-trigger and some other software that
> runs various tests for this.

*How* can you tell if it's dodgy? You can detect captivity as described
above, but how do you easily see whether an ISP is, say, tampering with
TTLs or with record data?
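
For instance, one manual way to check for TTL tampering (the addresses
and names here are placeholders):

dig @192.0.2.53 www.example.com A       # via the ISP's forwarder
dig @ns1.example.com www.example.com A  # directly at the authority
# A cached answer counts its TTL down, so a lower TTL is normal; a TTL
# *larger* than the authoritative one means something in the middle is
# rewriting responses.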


> > Consider also, that some ISP's force all port 53 traffic to their own
> > DNS servers too. How does unbound know when the ISP is forcing this?
> 
> unbound does not really care about transparent proxies on port 53. As
> long as they don't break DNS (and DNSSEC). If they redirect port 53 to
> some broken DNS server, unbound will try to work around it. If port 53
> is broken it will attempt DNS over port 80 of various fedoraproject DNS
> servers, or DNS over TLS on port 443.

How do you set up DNS over TLS?
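
For reference, I believe the unbound side of this looks roughly like the
following (a sketch based on unbound's documented ssl-upstream option;
the address is a placeholder):

server:
    ssl-upstream: yes              # wrap upstream queries in TLS
forward-zone:
    name: "."
    forward-addr: 192.0.2.1@443    # DNS over TLS on the HTTPS port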

Again, how can you guarantee that the fedora infrastructure won't go
down? My devil's advocate side points out that we are adding more
reliance on "third party" infrastructure here. Could it again be a case,
similar to the mirrors, where you can "become" a fedoraproject DNS node
to help load balance?

> 
> > Essentially, what I'm hearing at the moment, is that the proposal isn't
> > just a caching DNS server: It's a DNS server that will be:
> >
> > * DNSSEC
> > * Caching
> > * Attempts to always bypass my local DNS forwarder.
> 
> I hope I clarified it now that your third bullet point is not the case.

It's no longer really the case, which makes me happier, but I want to
know the conditions under which you decide a DNS server is "dodgy" or
not.


> > I'm glad that the NM integration is being considered, that will help. I
> > might not be afraid to touch a CLI, but I do think of users who use the
> > GUI only.
> 
> This is why we did not want to force everyone on dnssec-triggerd. We
> know that solution is not good enough for non-devs.

Agreed.

> 
> > In summary, all I ask is that:
> >
> > * If a forwarder exists on the network, unbound uses it for all queries.
> 
> Yes, but not for open wifi. Only for physical wire and secured wifi.

Okay. Can this point be made clear on the proposal page? Also the
conditions for what counts as a physical wire and as secured wifi?

There are also a number of tethering situations that may actually be
misinterpreted as secure. E.g. my phone runs a WPA2 hotspot whose DNS
goes out via 3G. How does unbound treat this? It would only see the
secure wifi ...


Consider also that some wifi hotspots have their own "local" zones that
are needed, so again, I do think that unbound should use the local
forwarder irrespective of network security, because otherwise you may
risk breaking things. Or how would you suggest this is solved? For
argument's sake, let's say:

SSID: myawesomeopenhotspot
DHCP provides no domain-name info.
I CNAME all records to my.hotspot. until authenticated. 

Your hotspot test will be triggered, but if unbound won't use the local
forwarder, you won't be able to resolve my.hotspot. on the insecure
wifi. 
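
To illustrate with a hypothetical transcript (names and addresses made
up for the example), a client on that hotspot would see:

$ dig +short www.example.com
my.hotspot.
10.0.0.1          # the portal's login page

Since my.hotspot. only exists in the hotspot's own DNS, bypassing the
forwarder makes the login page unresolvable.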

> 
> > * If that forwarder returns an invalid signed DNSSEC zone then you
> > bypass it for only that zone. (IE the zone is being tampered with)
> 
> That's not how things work. The DNS server is either capable or not
> capable of doing DNSSEC. That is not a "per zone" thing. If it fails
> to return RRSIG signature records for the root zone, there is nothing
> you can do but forget about that server. (technically speaking, I do
> what you say, if you consider "." to be "only that zone")

Okay that makes sense. 
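
If I follow, the bypass then just amounts to dropping the "." forward so
unbound recurses directly, something like (a sketch using
unbound-control's documented commands):

sudo unbound-control forward off     # stop forwarding "." upstream
sudo unbound-control flush_zone .    # drop anything learned from it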

> 
> > * Unbound flushes its cache between interface state changes, because
> > you are moving between networks with different DNS views of the world.
> 
> I am not convinced that is required. It does a lot of damage too.
> 
> > * That you keep the DNS cache time short, to help avoid issues with DNS
> > admins who forcefully increase TTLs. Consider google, with its TTL of
> > 300. Perhaps even set each cached record to have a cache time of its
> > TTL or 3600, whichever is lower.
> 
> No. As I stated repeatedly, we are NOT in the business of modifying DNS
> records. If people publish long TTLs, we will honor those TTLs. Doing
> otherwise is similar to launching an attack on the nameservers of those
> domains, which might not be able to handle such short TTLs. Imagine if I
> run a domain using a nameserver on my DSL, with a TTL of 7200. The name
> gets known, and everyone starts hammering it because middle boxes cut
> the TTL to 300. That's irresponsible.

Okay, but let's combine these two points. My ISP mucks with the TTL of
some website, raising it from say 300 to 30000000. Unbound would respect
this up to its TTL maximum (which is still 86400, IIRC). If you aren't
flushing the cache between networks you could end up with:

* Suboptimal routes causing a poor user experience.
* Incorrect cached zone data moving between networks with different DNS
views of the world.

Ignoring the TTL change, let's just look at flushing between network
state changes. This would solve both of the points listed above. You
only need to rebuild the cache on first connecting to a network,
meaning:

* You are caching, for that session, the correct results as that network
sees them.
* You get the TTLs for that network (even if they were tampered with).
* You don't take that data to other networks.
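
Mechanically this could be as small as the following, run from something
like a NetworkManager dispatcher hook (the hook is my assumption; the
commands are standard unbound-control ones):

sudo unbound-control flush_zone .        # drop the entire cache
sudo unbound-control flush_requestlist   # and any in-flight queries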

Alternately, consider a per-network cache? I.e. on one network I build a
named cache for that network, and when I rejoin it later I reuse that
cache.

This way the cache:

* Persists across interface changes
* Keeps sane zone data in each network environment. 

This would also solve my 3G WPA hotspot case, given the cache built on
the hotspot doesn't pollute my work wifi, for example.
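
Existing unbound-control commands could even approximate this (a sketch;
naming the cache file by SSID is my assumption):

# leaving a network: save that network's cache under its name
sudo unbound-control dump_cache > /var/cache/unbound/"$SSID".cache

# rejoining it later: start clean, then restore the saved cache
sudo unbound-control flush_zone .
sudo unbound-control load_cache < /var/cache/unbound/"$SSID".cache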



> 
> > I'm trying to think about the "user experience" of fedora here rather
> > than a technically perfect world. These suggestions will eliminate all
> > the concerns I have with this system and would hopefully make the
> > default experience better. :)
> 
> I think we are fairly close to agreement on what's needed. Thank you
> for discussing this with us. It is clear now that we must flush the
> entire cache when we use a forwarder for more than one domain (eg not
> the VPN cases) when using authenticated networks. That is something I
> had not considered before.
> 

Thanks. That's what these mailing lists are for, even if it can end up
in essay-length posts. At the end of the day, I'm sure we both want the
best user experience.



In summary (possibly something to add to the wiki):

* Unbound does captive portal detection. Detail how it's done (see above
in this email).

* Unbound tries to find dodgy DNS servers. Detail how this detection is
done.

* On an open (insecure) access point, unbound bypasses the local
forwarder, except for names listed in the single-valued "domain-name"
option from DHCP.

* On a secure network (Encrypted wifi, lan) unbound will use the
forwarders as provided by DHCP.

* Unbound will flush the cache between authenticated networks. (If I
read your last point correctly)


Sincerely,


-- 
William Brown <william at firstyear.id.au>


