DNS in OpenVPN: a better approach

/dev/rob0, 2012-11-17

A mini-howto demonstrating the use of dnsmasq in a multi-site VPN

I've seen lots of people on the OpenVPN users mailing list who jump through hoops and use awful kludges to make DNS work in their VPN.

Typically they want VPN/remote LAN names to resolve for VPN-connected hosts, or when redirecting the gateway for clients, they want to ensure that DNS is redirected also.

For Windows clients this is easy; simply push the DNS settings you want them to use. That does not work for Unix/Linux clients, because our DHCP clients know the difference between a tun interface and an interface which is truly capable of DHCP.

Is this a disadvantage for Unix/Linux? Not at all! We have many other, and better, options at our disposal.

Enter dnsmasq(8)

"Dnsmasq is a lightweight, easy to configure DNS forwarder and DHCP server," according to its web page. It's also a better client-side nscd replacement, and it can provide a seamless transition among multiple private namespaces. As such it is the perfect complement to OpenVPN.

The idea here is simple. Each VPN-connected site has its own domain, or "zone" in DNS terms, and coincidentally, also its own in-addr.arpa zone[s] for reverse DNS resolution of IP addresses in that site's own netblock. Each site serves its own names from its own dnsmasq instance. Each site's dnsmasq is configured to ask every other site for names in that other site's zone (as well as for PTRs for IP addresses in that other site's netblock).
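
To make the reverse-zone part concrete, here is a minimal Python sketch that lists the in-addr.arpa zones covering a netblock. It is only an illustration and not part of the setup itself; the function name reverse_zones is made up, and it assumes IPv4 netblocks of /24 or larger. The netblock used, 172.30.8.0/22, is the Florence LAN from the example further down.

```python
import ipaddress


def reverse_zones(netblock):
    """List the /24-sized in-addr.arpa zones that cover an IPv4 netblock."""
    net = ipaddress.ip_network(netblock)
    if net.version != 4 or net.prefixlen > 24:
        raise ValueError("this sketch only handles IPv4 netblocks of /24 or larger")
    zones = []
    first, last = int(net.network_address), int(net.broadcast_address)
    for base in range(first, last + 1, 256):  # step through one /24 at a time
        a, b, c, _ = str(ipaddress.ip_address(base)).split(".")
        zones.append(f"{c}.{b}.{a}.in-addr.arpa")
    return zones


for zone in reverse_zones("172.30.8.0/22"):
    print(zone)
```

A /22 spans four /24s, so this prints four zones, 8.30.172.in-addr.arpa through 11.30.172.in-addr.arpa.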

If you're fluent in DNS concepts, that should be enough to get you started. If not, read on for an example.

Example Company

Our example company has three offices and a number of road warrior clients. The domain name is "example.com". Site servers at each site are Linux.

Florence

Florence (Alabama) is the home office, and it runs a VPN server for the roving clients. Local clients get their IP addresses in florence.pvt.example.com from the dnsmasq instance on the site server. The Florence server is on the Internet as Florence.example.com at 192.0.2.10, and its internal network is 172.30.8.0/22. Internal clients reach the server via 172.30.8.1. The VPN server's network is 172.30.6.0/24, and the VPN server's address is 172.30.6.1.

VPN clients are given TLS certificates in the form "user.vpn.example.com", and that is the name which is entered into the /etc/hosts file on the Florence site server.

TODO: a script to automate entries into /etc/hosts when a client is first assigned an IP address by the OpenVPN server. Another option, which might be cleaner: a dynamic zone in named which OpenVPN can populate using nsupdate(8) and its own TSIG key. (It's clean, but definitely not as simple as dnsmasq, and I want this to be a simple example.)
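
One possible shape for the script in the TODO above, as a sketch only. OpenVPN's --learn-address option runs a command with the operation (add, update or delete), the address, and, for add and update, the certificate's common name. Everything else here is an assumption: entries go into a separate file, the hypothetical /etc/hosts.vpn, which dnsmasq would read via addn-hosts=/etc/hosts.vpn instead of editing /etc/hosts itself; OpenVPN needs script-security 2 to run the hook; and dnsmasq must be sent SIGHUP afterwards to re-read the file, which this sketch does not do.

```python
#!/usr/bin/env python3
"""Sketch of an OpenVPN --learn-address hook that maintains a hosts file."""
import ipaddress
import os
import sys
import tempfile

# Hypothetical path; dnsmasq would read it via addn-hosts=/etc/hosts.vpn
HOSTS_FILE = "/etc/hosts.vpn"


def update_hosts(path, operation, address, common_name=None):
    """Apply one learn-address event to `path`; return True if it was handled."""
    if operation not in ("add", "update", "delete"):
        return False
    try:
        ip = str(ipaddress.ip_address(address))
    except ValueError:
        return False  # subnets and MAC addresses are not host entries

    try:
        with open(path) as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        lines = []

    def is_stale(line):
        # An entry is stale if it names this address or this client.
        fields = line.split("#", 1)[0].split()
        return bool(fields) and (fields[0] == ip or common_name in fields[1:])

    kept = [line for line in lines if not is_stale(line)]
    if operation != "delete" and common_name:
        kept.append(f"{ip}\t{common_name}")

    # Write a temporary file and rename it, so readers never see a partial file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    with os.fdopen(fd, "w") as f:
        f.writelines(line + "\n" for line in kept)
    os.chmod(tmp, 0o644)  # mkstemp creates 0600; dnsmasq must be able to read it
    os.replace(tmp, path)
    return True


# OpenVPN calls the hook as: <script> <operation> <address> [<common name>]
if __name__ == "__main__" and len(sys.argv) >= 3:
    update_hosts(HOSTS_FILE, *sys.argv[1:4])
```

Keeping the VPN names in their own file means a bug in the hook cannot damage /etc/hosts. Whether the OpenVPN process still has permission to write the file and to signal dnsmasq after it drops privileges depends on the local setup.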

Savannah

Savannah (Tennessee) is one of two branch offices. Local clients get their IP addresses in savannah.pvt.example.com from the dnsmasq instance there. The Savannah server is on the Internet at 198.51.100.20 as Savannah.example.com, and its internal network is 172.30.16.0/24. Internal clients reach the server via 172.30.16.1.

Savannah.example.com runs two OpenVPN instances: one, a client of Florence; the other, a p2p tunnel to Tupelo, q.v. The former has an IP address of 172.30.6.16; the latter, 172.30.0.16.

Tupelo

Tupelo is the other branch office. Its hostname is Tupelo.example.com, and its internal DNS zone (also served by dnsmasq) is tupelo.pvt.example.com. Tupelo.example.com is 203.0.113.30 on the Internet. The LAN range is 172.30.32.0/23, and the server's LAN IP address is 172.30.32.1.

Tupelo.example.com also runs two OpenVPN instances: one, a client of Florence; the other a p2p tunnel to Savannah. The former has an IP address of 172.30.6.32; the latter, 172.30.0.32.

Dnsmasq configurations

The Florence.example.com server is authoritative for the zones Florence.pvt.example.com and vpn.example.com, for 6.30.172.in-addr.arpa (the VPN network), and for 8.30.172.in-addr.arpa through 11.30.172.in-addr.arpa (the LAN).

The Savannah.example.com server is authoritative for the zones Savannah.pvt.example.com and 16.30.172.in-addr.arpa.

The Tupelo.example.com server is authoritative for the zones Tupelo.pvt.example.com, 32.30.172.in-addr.arpa, and 33.30.172.in-addr.arpa.

Each server is configured to ask the others for names in their zones; this configuration is static and independent of the VPN status. If the VPN is down for any reason, resolution of uncached records from the other sites will fail. (This is not a problem, because routing to those IP addresses would fail as well.)

Clients' configuration is simpler, fortunately. They merely have to ask Florence (the VPN server host) for both example.com subzones (pvt and vpn) and for 30.172.in-addr.arpa, which covers all the reverse DNS. As with the servers, the clients' dnsmasq configuration is static and independent of the VPN status.

Note: only the Florence and VPN clients' configurations are fully commented.

Florence

# domain: the name appended to DHCP clients' hostnames
# and assumed for any non-qualified names queried
domain=Florence.pvt.example.com
# local: These are names never forwarded; authoritative here.
# Note the trailing slash, which closes the domain list.
local=/6.30.172.in-addr.arpa/
local=/8.30.172.in-addr.arpa/
local=/9.30.172.in-addr.arpa/
local=/10.30.172.in-addr.arpa/
local=/11.30.172.in-addr.arpa/
local=/vpn.example.com/
# we use this dnsmasq as this system's own/only resolver, so no
# reason to look in resolv.conf
no-resolv
# assuming we run named or other caching resolver on port 1053
server=127.0.0.1#1053
# this works if you don't
#server=8.8.4.4
# Privilege separation for maximum security; start as root and
# drop privileges to the following user/group
user=dnsmasq
group=dnsmasq
# Serve DHCP from the upper half (172.30.10.0/23) of the
# 172.30.8.0/22 netblock, stopping short of the broadcast
# address, and giving out two-hour leases.
dhcp-range=eth,172.30.10.0,172.30.11.254,255.255.252.0,2h
# other connected sites will cache our records for one hour
local-ttl=3600
# Savannah section: we go to Savannah for their names
server=/Savannah.pvt.example.com/172.30.6.16
server=/16.30.172.in-addr.arpa/172.30.6.16
# Tupelo tidbit: likewise for Tupelo
server=/Tupelo.pvt.example.com/172.30.6.32
server=/32.30.172.in-addr.arpa/172.30.6.32
server=/33.30.172.in-addr.arpa/172.30.6.32

Savannah

domain=Savannah.pvt.example.com
local=/16.30.172.in-addr.arpa/
no-resolv
server=127.0.0.1#1053
#server=8.8.4.4
user=dnsmasq
group=dnsmasq
# Serve DHCP from 172.30.16.128/26 within a 172.30.16.0/24
# netblock, giving out two-hour leases.
dhcp-range=eth,172.30.16.128,172.30.16.191,255.255.255.0,2h
local-ttl=3600
# Florence fragment
server=/Florence.pvt.example.com/172.30.6.1
server=/6.30.172.in-addr.arpa/172.30.6.1
server=/8.30.172.in-addr.arpa/172.30.6.1
server=/9.30.172.in-addr.arpa/172.30.6.1
server=/10.30.172.in-addr.arpa/172.30.6.1
server=/11.30.172.in-addr.arpa/172.30.6.1
# Tupelo tidbit: reached directly over the p2p tunnel
server=/Tupelo.pvt.example.com/172.30.0.32
server=/32.30.172.in-addr.arpa/172.30.0.32
server=/33.30.172.in-addr.arpa/172.30.0.32

Tupelo

domain=Tupelo.pvt.example.com
local=/32.30.172.in-addr.arpa/
local=/33.30.172.in-addr.arpa/
no-resolv
server=127.0.0.1#1053
#server=8.8.4.4
user=dnsmasq
group=dnsmasq
# Serve DHCP from the upper half (172.30.33.0/24) of the
# 172.30.32.0/23 netblock, stopping short of the broadcast
# address, and giving out two-hour leases
dhcp-range=eth,172.30.33.0,172.30.33.254,255.255.254.0,2h
local-ttl=3600
# Florence fragment
server=/Florence.pvt.example.com/172.30.6.1
server=/6.30.172.in-addr.arpa/172.30.6.1
server=/8.30.172.in-addr.arpa/172.30.6.1
server=/9.30.172.in-addr.arpa/172.30.6.1
server=/10.30.172.in-addr.arpa/172.30.6.1
server=/11.30.172.in-addr.arpa/172.30.6.1
# Savannah section
server=/Savannah.pvt.example.com/172.30.0.16
server=/16.30.172.in-addr.arpa/172.30.0.16
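
Each dhcp-range line in the three server configurations pairs an address range with a netmask, and the two are easy to get out of step with the netblock. The following Python sketch is one way to sanity-check them. The helper name check_dhcp_range is made up for illustration, and it only checks the arithmetic, not how dnsmasq itself treats a range.

```python
import ipaddress


def check_dhcp_range(netblock, first, last):
    """Return (netmask, problems) for a DHCP range inside an IPv4 netblock."""
    net = ipaddress.ip_network(netblock)
    start, end = ipaddress.ip_address(first), ipaddress.ip_address(last)
    problems = []
    if start > end:
        problems.append(f"range is reversed: {start} > {end}")
    for label, addr in (("start", start), ("end", end)):
        if addr not in net:
            problems.append(f"{label} {addr} is outside {net}")
        elif addr == net.network_address:
            problems.append(f"{label} {addr} is the network address of {net}")
        elif addr == net.broadcast_address:
            problems.append(f"{label} {addr} is the broadcast address of {net}")
    return str(net.netmask), problems


# Savannah: 172.30.16.0/24 with the range from its dhcp-range line
print(check_dhcp_range("172.30.16.0/24", "172.30.16.128", "172.30.16.191"))
# → ('255.255.255.0', [])
```

The returned netmask is the one that belongs in the dhcp-range line for that netblock: a /22 gives 255.255.252.0 and a /23 gives 255.255.254.0.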

VPN Clients

# We only serve DNS and only on loopback
no-dhcp-interface=lo
bind-interfaces
# we use this dnsmasq as this system's own/only resolver
listen-address=127.0.0.1
no-resolv
# assuming we run named or other caching resolver on port 1053
#server=127.0.0.1#1053
# this works if you don't
server=8.8.4.4
user=dnsmasq
group=dnsmasq
# queries for VPN names go to the VPN server, if we're
# connected. If not connected, they won't resolve.
server=/pvt.example.com/172.30.6.1
server=/vpn.example.com/172.30.6.1
server=/30.172.in-addr.arpa/172.30.6.1
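
How dnsmasq chooses among those server= lines can be modelled roughly as below. The dnsmasq man page says that more specific domains take precedence over less specific ones; the sketch imitates that with a longest-suffix match. It is a model for reasoning about the configuration, not dnsmasq's actual code. The names parse_servers and pick_upstream are illustrative, and pc1 is a hypothetical host.

```python
def parse_servers(conf_text):
    """Collect (domain, upstream) pairs from server= lines; '' is the default."""
    rules = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or not line.startswith("server="):
            continue
        value = line[len("server="):]
        if value.startswith("/"):
            # server=/domain[/domain...]/upstream
            *domains, upstream = value[1:].split("/")
            rules.extend((d.lower(), upstream) for d in domains)
        else:
            rules.append(("", value))
    return rules


def pick_upstream(name, rules):
    """Return the upstream for the most specific matching domain, or None."""
    name = name.lower().rstrip(".")
    best = None
    for domain, upstream in rules:
        if domain == "" or name == domain or name.endswith("." + domain):
            if best is None or len(domain) > len(best[0]):
                best = (domain, upstream)
    return best[1] if best else None


CLIENT_CONF = """
server=8.8.4.4
server=/pvt.example.com/172.30.6.1
server=/vpn.example.com/172.30.6.1
server=/30.172.in-addr.arpa/172.30.6.1
"""

rules = parse_servers(CLIENT_CONF)
print(pick_upstream("pc1.Savannah.pvt.example.com", rules))  # → 172.30.6.1
print(pick_upstream("www.example.com", rules))               # → 8.8.4.4
```

Anything under pvt.example.com, vpn.example.com or 30.172.in-addr.arpa goes to Florence over the VPN; everything else, including the public example.com names, goes to the default upstream.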

That's it!

Nothing more to it, although just as with the global DNS, it can be built upon and expanded as needed. Name resolution is more or less constant; no fiddling with resolv.conf files, and no inconsistent results.

Those who are the most paranoid among us will never need to trust external nameservers. The sample dnsmasq.conf files referred to a named(8) running on an alternate port, 1053. In the interest of completeness, the very simple named.conf(5) for that is shown here:

options {
    listen-on port 1053 { 127.0.0.1; };
    allow-query { localhost; };
};

With that, named will use its built-in root hints and provide name service on port 1053 only for the local machine.

Questions?

Q: The sample shows both named and dnsmasq in use. Why would you want two nameservers on one machine?

A: They don't do the same thing. named does the recursion: it goes out and asks for others' names. dnsmasq is authoritative, serving our own names, and hands everything else to named. It is a common recommendation to keep your authoritative name service separate from your caching/recursive resolver service.

Q: What's wrong with resolvconf or openresolv?

A: They just seem like the wrong approach, to me. Unix was built with the ability to run services that can make our lives easier. Why not do that? They're probably running nscd anyway, and as I said, dnsmasq is a better nscd. These scripts are kludges which rely on external nameservers, and I have found in many cases that ISPs do not provide reliable nameservers. It's not just paranoia: there are good reasons why you should want to control your own DNS cache.

Q: What's wrong with nscd?

A: To be honest, I have lived most of my Unix life away from nscd. I am primarily a Slackware Linux user, and Slackware includes but does not activate nscd. I am told that this is a deliberate design decision. I have heard of numerous problems with it, such as it not understanding or not seeing DNS TTL. Some of the things it caches don't need to be cached on typical file-based systems. (And replacing files which work well for the likes of protocols, services, and the mostly unused other NSS services does not make good sense. In some cases, user/group lookups might be handled by a network daemon, and thus benefit from caching. But for most general purposes the only NSS service which needs to be cached is "hosts", or DNS. Again, dnsmasq is true DNS caching software built with DNS in mind.) So in a nutshell, my prejudice against nscd is fed by its neglect in Slackware and hearsay. If you like it, I'm sorry. I'm going to continue to live without it.

Q: I sometimes need to use an external nameserver on a VPN client, because of its local names. How can I do that?

A: One solution is to expand on what was shown above. Add your other site's local zones to the client configuration:

server=/other.site/192.168.3.3
server=/3.168.192.in-addr.arpa/192.168.3.3

It does not matter that you're not always connected there. You won't be able to resolve those names, but you can't route to them anyway. And when you are there, it will work perfectly and transparently.

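Another solution is to use hooks in your DHCP client to write nameservers to an alternate file, such as /var/lib/dhcpcd/resolv.conf, and add this as a --resolv-file to your configuration. For that to take effect, remove the no-resolv line shown in the client configuration above, since no-resolv tells dnsmasq to take its upstream servers only from its own configuration: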

resolv-file=/var/lib/dhcpcd/resolv.conf

Q: What do I do if my sites' network names or addresses overlap or conflict?

A: Sigh. Sucks to be you. Apply a LART to your network administrator. Print RFC 1918 on a roll of toilet paper and give it as a gift. Really, there is no excuse for this. A site administrator should use a name which s/he controls and an RFC 1918 netblock which won't conflict with the common ones. You might be stuck with the resolvconf or openresolv kludge.

Question not listed here? Write to me at rob0-dnsvpn@nodns4.us.