More systemd networking fun

Published: February 2, 2024

What was once an elegant machine built up by the wisdom of thousands of volunteers is now a rusty bucket with holes: my Debian installation.

Having recently fixed my interface configuration, I thought I was done solving my networking problems.

On my local network, I rely on mDNS a lot, on both macOS and Debian. To ssh, mosh, and sync git annexes between computers, I use .local addresses extensively.
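Day to day, that looks roughly like this (the git-annex remote name is just an assumption for illustration):

$ ssh lithium.local
$ mosh lithium.local tmux
$ git annex sync lithium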

In this scenario, I have two computers:

helium (Debian workstation) --- Ethernet --- UniFi Dream Router --- WiFi --- lithium (MacBook Air)

To my surprise, when turning on my computers this morning, I noticed that I wasn’t able to mosh into my MacBook Air to look at emails through neomutt, or access my Pomodoro timer (which I named Pomoglorbo).

I usually connect using mosh lithium.local tmux from my helium workstation. When I try to ping lithium.local, I get:

ping: lithium.local: Name or service not known

But interestingly, when I try pinging just lithium, which I don’t remember working before, I get:

PING lithium (10.0.57.235) 56(84) bytes of data.
64 bytes from lithium (10.0.57.235): icmp_seq=1 ttl=64 time=2.11 ms
64 bytes from lithium (10.0.57.235): icmp_seq=2 ttl=64 time=2.70 ms
64 bytes from lithium (10.0.57.235): icmp_seq=3 ttl=64 time=2.09 ms

and at least I know that there is a route. So now I can check whether I can ping myself (helium) from lithium while ssh’ing into it from helium. First, we try using the .local mDNS domain:

# ssh lithium ping -c 2 helium.local
PING helium.local (10.0.56.202): 56 data bytes
64 bytes from 10.0.56.202: icmp_seq=0 ttl=64 time=4.288 ms
64 bytes from 10.0.56.202: icmp_seq=1 ttl=64 time=2.279 ms

--- helium.local ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 2.279/3.284/4.288/1.005 ms

And for good measure we see what happens when we ping our Debian workstation without the .local suffix:

# ssh lithium ping -c 2 helium
PING helium (10.0.56.202): 56 data bytes
64 bytes from 10.0.56.202: icmp_seq=0 ttl=64 time=2.281 ms
64 bytes from 10.0.56.202: icmp_seq=1 ttl=64 time=2.765 ms

--- helium ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 2.281/2.523/2.765/0.242 ms

Finally, we ping ourselves using our mDNS domain and frustratingly get a good response:

# ping -c 1 helium.local
PING helium.local (10.0.56.202) 56(84) bytes of data.
64 bytes from helium (10.0.56.202): icmp_seq=1 ttl=64 time=0.052 ms

--- helium.local ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.052/0.052/0.052/0.000 ms

So, we know that

  1. There is a route between the two computers.
  2. Somehow, helium can still resolve a name for lithium, just not lithium.local.
  3. lithium can resolve helium.local without any difficulties.
  4. helium can resolve helium.local.

Most likely, something is wrong with the mDNS configuration on helium. Currently we run Avahi as our mDNS resolver, which we can confirm by running systemctl status avahi-daemon.service.

# systemctl status avahi-daemon.service
● avahi-daemon.service - Avahi mDNS/DNS-SD Stack
     Loaded: loaded (/lib/systemd/system/avahi-daemon.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-02-02 09:52:51 JST; 1h 27min ago
TriggeredBy: ● avahi-daemon.socket
[...]

And if we try to resolve devices on the local network using avahi-browse, we see:

# avahi-browse --all --resolve --terminate | head
+ enp7s0 IPv6 Brother HL-L2375DW series                     Web Site             local
+ enp7s0 IPv4 Brother HL-L2375DW series                     Web Site             local
+ enp7s0 IPv6 Brother HL-L2375DW series                     Internet Printer     local
+ enp7s0 IPv4 Brother HL-L2375DW series                     Internet Printer     local
+ enp7s0 IPv6 Brother HL-L2375DW series                     UNIX Printer         local
+ enp7s0 IPv4 Brother HL-L2375DW series                     UNIX Printer         local
+ enp7s0 IPv6 Brother HL-L2375DW series                     PDL Printer          local
+ enp7s0 IPv4 Brother HL-L2375DW series                     PDL Printer          local

Avahi is the mDNS resolver I deserved, pre-installed, but not the one I need right now. So I’ll uninstall it and see how deep the rabbit hole of mDNS configuration goes. (Also, breaking your system and then fixing it is exciting.)

systemd-resolved

systemd-resolved is decidedly not the standard network name resolver on Debian. The Debian 12 (bookworm) release notes mention, almost in passing, that systemd-resolved is a second-class citizen:

Note that systemd-resolved was not, and still is not, the default DNS resolver in Debian.

That has never kept me from chasing my dreams.

First, I remove avahi-* and install systemd-resolved:

# sudo apt remove 'avahi-*'
[...]
The following packages will be REMOVED:
  avahi-autoipd avahi-daemon avahi-utils
0 upgraded, 0 newly installed, 3 to remove and 0 not upgraded.
After this operation, 528 kB disk space will be freed.
[...]
# sudo apt install systemd-resolved
[...]
The following additional packages will be installed:
  libnss-resolve
The following NEW packages will be installed:
  libnss-resolve systemd-resolved
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 402 kB of archives.
After this operation, 1,028 kB of additional disk space will be used.
[...]
Converting /etc/resolv.conf to a symlink to /run/systemd/resolve/stub-resolv.conf...
Created symlink /etc/systemd/system/dbus-org.freedesktop.resolve1.service → /lib/systemd/system/systemd-resolved.service.
Created symlink /etc/systemd/system/sysinit.target.wants/systemd-resolved.service → /lib/systemd/system/systemd-resolved.service.
[...]

We test what happens in this twilight state of mDNS resolution and try to resolve the lithium.local (the MacBook) and helium.local (our own) addresses:

# ping -c 1 lithium.local
ping: lithium.local: Temporary failure in name resolution
# ping -c 1 -w 5 helium.local
PING helium.local(helium (fe80::dabb:c1ff:fed0:357c%enp7s0)) 56 data bytes

--- helium.local ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4092ms

For the first result, I don’t have a good mental model of the whole chain from ping through mDNS to sending an actual ICMP packet, but name resolution failing certainly makes some sense.
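A rough sketch of that chain, as I understand it: ping calls getaddrinfo(), glibc walks the hosts line of /etc/nsswitch.conf, and with libnss-resolve installed the resolve module hands the lookup to systemd-resolved, which is the part that actually speaks mDNS. Checking which modules are in play looks something like this (the exact line and module order will vary with what is installed):

$ grep '^hosts:' /etc/nsswitch.conf
hosts: files resolve [!UNAVAIL=return] dns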

For the second one, it seems like helium.local resolves to our network interface’s link-local IPv6 address, but no response ever comes back. This could be because of an mDNS issue, or because we don’t respond to ICMPv6. Simply listening with tcpdump for ICMPv6 echo requests yields nothing:

$ sudo tcpdump "icmp6[0] == 128"
<- imagine nothing here

The service itself is running, which is good.

# systemctl is-active systemd-resolved.service
active

We run resolvectl status to check the status of multicast DNS and see that it is not enabled on enp7s0, our Ethernet interface:

# resolvectl status
Global
       Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (enp7s0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 10.0.48.1
       DNS Servers: 10.0.48.1

Fine, so we go on actually configuring systemd-resolved and refer to the ArchWiki article on this matter.

Our /etc/systemd folder has the following resolved-related files:

# fd resolve /etc/systemd/
/etc/systemd/resolved.conf
/etc/systemd/system/dbus-org.freedesktop.resolve1.service
/etc/systemd/system/sysinit.target.wants/systemd-resolved.service

We see that /etc/resolv.conf is now symlinked and managed by systemd-resolved:

# cat /etc/resolv.conf (symlinked to -> ../run/systemd/resolve/stub-resolv.conf)
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
[...]
nameserver 127.0.0.53
options edns0 trust-ad
search .

That looks perfectly fine. We are able to resolve regular domains on the Internet, and we now just have to make mDNS work on our local network. We refer to the section on mDNS configuration and update /etc/systemd/network/10-ethernet.network, which we previously created to have systemd-networkd handle network configuration for us automatically.

The contents currently are:

# /etc/systemd/network/10-ethernet.network
[Match]
Type=ether

[Network]
DHCP=yes

We add the following line at the end of the [Network] section:

# ...
[Network]
DHCP=yes
MulticastDNS=yes
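One caveat: the per-link setting only takes effect if systemd-resolved’s global MulticastDNS= option also allows it. The Global scope in the resolvectl output above already showed +mDNS, so there is nothing to change here, but on a system where it is disabled, the knob would live in /etc/systemd/resolved.conf:

# /etc/systemd/resolved.conf (only needed if the Global scope shows -mDNS)
[Resolve]
MulticastDNS=yes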

We run sudo networkctl reload and resolvectl status enp7s0 to see if anything has changed:

# resolvectl status enp7s0
Link 2 (enp7s0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6 mDNS/IPv4 mDNS/IPv6
         Protocols: +DefaultRoute +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 10.0.48.1
       DNS Servers: 10.0.48.1

At this point I suspect that there is something wrong with my MacBook, and I use a third computer running macOS on the same network to see if I can ping it at lithium.local. It turns out I can’t. I restart lithium, and now I can reach it from the third computer. Perhaps I have encountered a rare networking problem in macOS? Maybe I am too quick to blame my humble Debian configuration skills.
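In hindsight, a gentler fix than a full reboot might have been to poke macOS’s mDNS responder directly; the usual incantation for flushing it is something like this (untested in this particular situation):

$ sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder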

Sometimes turning it off and on again is a cure for all kinds of problems. It’s anti-climactic, but we still feel good about having learned a new tool.

To round it off, we see if resolvectl can show us whether lithium.local is reachable:

# resolvectl query lithium.local
lithium.local: fe80::14af:74d4:5643:ecf8%2     -- link: enp7s0

-- Information acquired via protocol mDNS/IPv6 in 920us.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: cache

We have configured systemd-resolved correctly to resolve local network host names, and learned the value of restarting machines from time to time.
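As one last check through the same path that ping takes (glibc’s NSS modules rather than resolved’s D-Bus API), getent should now also be able to look the name up; if this prints an address, the whole chain from nsswitch.conf through nss-resolve to systemd-resolved is working:

# getent hosts lithium.local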

Bonus round: allow both IPv4 and IPv6 source networks in ~/.ssh/authorized_keys

Given an SSH public key like ssh-rsa AAAA..., you can restrict it in your ~/.ssh/authorized_keys file so that it is only accepted from certain networks, like so:

from="10.0.0.0/8,fe80::*" ssh-rsa AAAA...

Where we allow two networks:

  1. 10.0.0.0/8 (a private IPv4 network), and
  2. fe80::* (IPv6 link-local addresses, the closest IPv6 analogue on this network).

This adds another layer of defense, not even letting devices authenticate with a key if they are on the wrong network. Of course, your /etc/ssh/sshd_config should have something like these lines:

KbdInteractiveAuthentication no
UsePAM yes
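And if the intent is key-only logins, it presumably also makes sense to turn off password authentication altogether:

PasswordAuthentication no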

I would be thrilled to hear from you! Please share your thoughts and ideas with me via email.
