~fkooman/vpn-daemon

Summary: OpenVPN & WireGuard Management Daemon

Description: This is a simple daemon written in Go providing an HTTP(S) API as an abstraction on top of the management socket of (multiple) OpenVPN server process(es) and WireGuard.

License: MIT

NOTE: we are embedding code from the wgctrl-go project which is licensed under the MIT license. You can find it in the wgctrl directory of this project.

#API

#Info

| Call | Description | Method |
| --- | --- | --- |
| /i/node | Request Node Information | GET |

#Node Information

Request node information.

#Request
$ curl http://localhost:41194/i/node
#Response
HTTP/1.1 200 OK
Content-Type: application/json

{
    "cpu_count": 2,
    "load_average": [
        0.55,
        0.76,
        0.75
    ],
    "maintenance_mode": false,
    "node_uptime": 3828,
    "rel_load_average": [
        27,
        38,
        37
    ]
}

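The load_average field contains the three load averages (over the last 1, 5 and 15 minutes) as read from /proc/loadavg. See proc(5) for more information. The cpu_count field indicates the number of CPUs available when the daemon started. The rel_load_average field expresses each load average as a percentage of the available CPUs, i.e. 100 * load_average / cpu_count. In the example above, 100 * 0.55 / 2 = 27.5 is reported as 27.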

If the system the daemon runs on has no /proc/loadavg, or it is not available, an empty array, i.e. [], is returned for both load_average and rel_load_average.
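The relation between load_average, cpu_count and rel_load_average can be sketched in Go as follows. This is a minimal sketch, not the daemon's actual code: the function names parseLoadAverage and relLoadAverage are hypothetical, and truncating to an integer is an assumption based on the sample response above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLoadAverage extracts the first three fields (the 1, 5 and 15 minute
// load averages) from the contents of /proc/loadavg. It returns an empty
// slice when the data does not have the expected shape.
func parseLoadAverage(contents string) []float64 {
	fields := strings.Fields(contents)
	if len(fields) < 3 {
		return []float64{}
	}
	loads := make([]float64, 0, 3)
	for _, field := range fields[:3] {
		value, err := strconv.ParseFloat(field, 64)
		if err != nil {
			return []float64{}
		}
		loads = append(loads, value)
	}
	return loads
}

// relLoadAverage expresses each load average as a percentage of cpuCount.
// ASSUMPTION: the result is truncated to an integer, as the sample suggests.
func relLoadAverage(loads []float64, cpuCount int) []int {
	rel := make([]int, 0, len(loads))
	if cpuCount <= 0 {
		return rel
	}
	for _, load := range loads {
		rel = append(rel, int(100*load/float64(cpuCount)))
	}
	return rel
}

func main() {
	// Sample contents in the format of /proc/loadavg.
	loads := parseLoadAverage("0.55 0.76 0.75 1/213 4242\n")
	fmt.Println(loads, relLoadAverage(loads, 2)) // prints [0.55 0.76 0.75] [27 38 37]
}
```

Passing an empty string to parseLoadAverage yields an empty slice for both values, which matches the [] described above.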

The node_uptime field is the first field of /proc/uptime converted to an integer. It indicates the number of seconds since the system was booted, i.e. the system uptime.
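As an illustration of the conversion, here is a small Go sketch. The function name parseUptime is hypothetical, and dropping the fractional part is an assumption about what "converted to int" means here.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseUptime returns the whole seconds from the first field of the
// contents of /proc/uptime.
// ASSUMPTION: the fractional part is dropped, not rounded.
func parseUptime(contents string) (int, error) {
	fields := strings.Fields(contents)
	if len(fields) == 0 {
		return 0, errors.New("no uptime data")
	}
	seconds, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, err
	}
	return int(seconds), nil
}

func main() {
	// Sample contents in the format of /proc/uptime.
	uptime, err := parseUptime("3828.45 7012.33\n")
	fmt.Println(uptime, err) // prints 3828 <nil>
}
```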

NOTE: cpu_count shows the number of CPUs the host had at the moment the daemon was started. To reflect changes to the number of CPU cores, the vpn-daemon process MUST be restarted.

maintenance_mode is true if the node does not want to accept new connections, for example in preparation for a software update or reboot. It is set to true if the path /run/vpn-daemon/maintenance-mode exists.

#WireGuard

| Call | Description | Method |
| --- | --- | --- |
| /w/add_peer | Add a new peer | POST |
| /w/peer_list | List of configured peers | GET |
| /w/remove_peer | Remove a peer | POST |

#Add Peer

Add a new WireGuard peer.

| Parameter | Description | Required | Example(s) |
| --- | --- | --- | --- |
| public_key | The public key | Yes | kNLsIjAxQ8w3PWgj+Bx8mDComLFPKgCrxPvvq0Lsq2s= |
| ip_net | The IP network address(es) of the peer | Yes, >= 1 | 10.75.24.9/32, fdd1:6916:e5a0:6999::9/128 |

#Request

To specify both an IPv4 and IPv6 address that the peer is allowed to use, you can use this call:

$ curl \
    -d 'ip_net=10.75.24.9/32' \
    -d 'ip_net=fdd1:6916:e5a0:6999::9/128' \
    --data-urlencode 'public_key=kNLsIjAxQ8w3PWgj+Bx8mDComLFPKgCrxPvvq0Lsq2s=' \
    http://localhost:41194/w/add_peer
#Response
HTTP/1.1 204 No Content

#Peer List

Get a list of configured peers. Note that this is not a list of currently connected WireGuard clients.

| Parameter | Description | Required | Default | Example(s) |
| --- | --- | --- | --- | --- |
| show_all | Whether to also show "offline" peers | No | no | yes, no |

Offline peers are peers that have never performed a handshake with the WireGuard server, or performed the last handshake more than 3 minutes ago. The default is no.

#Request
$ curl http://localhost:41194/w/peer_list
#Response
HTTP/1.1 200 OK
Content-Type: application/json

{
    "peer_list": [
        {
            "bytes_in": 19732,
            "bytes_out": 32364,
            "ip_net": [
                "10.89.165.2/32",
                "fd07:7228:b99b:8b77::2/128"
            ],
            "last_handshake_time": "2021-12-28T13:31:04Z",
            "public_key": "0WFDZZRTFxbDjJRqZFuPabfh0pv6ZL9aG7i3Q47QXkQ="
        }
    ]
}

If the peer was not seen yet, the value of last_handshake_time is null. The format of last_handshake_time is RFC 3339, always in UTC, i.e. ends with Z.

#Remove Peer

Remove a peer.

| Parameter | Description | Required | Example |
| --- | --- | --- | --- |
| public_key | The WireGuard public key | Yes | kNLsIjAxQ8w3PWgj+Bx8mDComLFPKgCrxPvvq0Lsq2s= |

#Request
$ curl \
    --data-urlencode 'public_key=kNLsIjAxQ8w3PWgj+Bx8mDComLFPKgCrxPvvq0Lsq2s=' \
    http://localhost:41194/w/remove_peer
#Response
HTTP/1.1 200 OK
Content-Type: application/json

{
    "bytes_in": 21480,
    "bytes_out": 34200,
    "ip_net": [
        "10.89.165.2/32",
        "fd07:7228:b99b:8b77::2/128"
    ],
    "last_handshake_time": "2021-12-28T13:33:29Z",
    "public_key": "0WFDZZRTFxbDjJRqZFuPabfh0pv6ZL9aG7i3Q47QXkQ="
}

If the peer was not seen yet, the value of last_handshake_time is null. The format of last_handshake_time is RFC 3339, always in UTC, i.e. ends with Z.

Or, in case the peer was not registered at the node:

HTTP/1.1 204 No Content

#OpenVPN

| Call | Description | Method |
| --- | --- | --- |
| /o/connection_list | List of connected clients | GET |
| /o/disconnect_client | Disconnect a client | POST |

#Connection List

Get a list of OpenVPN clients connected to any of the OpenVPN processes managed by this daemon.

#Request
$ curl 'http://localhost:41194/o/connection_list'
#Response
HTTP/1.1 200 OK
Content-Type: application/json

{
    "connection_list": [
        {
            "common_name": "YrAoH/x5dIyS2EiWOtRldUTg2WV9q9zV5evkaNkLN0E=",
            "ip_four": "10.222.172.3",
            "ip_six": "fde6:76bd:ad97:ac5e::3"
        }
    ]
}

#Disconnect Client

Disconnect a connected OpenVPN client. It will disconnect the VPN client from all OpenVPN processes managed by this daemon.

| Parameter | Description | Required | Example |
| --- | --- | --- | --- |
| common_name | The client certificate "CN" | Yes | DDoG0rDvZ1YufRdDHk37MdgOX+lBgKoBzgOQKJ5dzy4= |

#Request
$ curl \
    --data-urlencode 'common_name=DDoG0rDvZ1YufRdDHk37MdgOX+lBgKoBzgOQKJ5dzy4=' \
    'http://localhost:41194/o/disconnect_client'
#Response
HTTP/1.1 204 No Content

#Building

$ git clone https://git.sr.ht/~fkooman/vpn-daemon && cd vpn-daemon
$ go build -o vpn-daemon tuxed.net/vpn-daemon/cmd/...

A Makefile is also provided for your convenience.

#Running

You can run vpn-daemon as root, or use systemd to manage the daemon.

#TLS

By default, having access to the TCP port of vpn-daemon is enough to use the API. This is fine when the daemon is only used on one system. When deploying multiple nodes, TLS (i.e. HTTPS) with client certificate authentication can be enabled.

TLS is automatically enabled when the CREDENTIALS_DIRECTORY environment variable is set. This variable should point to the directory containing the ca.crt, server.crt and server.key files. The ca.crt file will be used to validate the client certificates. If your CA uses intermediate certificates, they MUST be included in the server.crt as well.

In order to quickly set up your own CA and test TLS, you can do the following:

$ vpn-ca -init-ca
$ vpn-ca -server -name server
$ vpn-ca -client -name client

Start vpn-daemon:

$ sudo CREDENTIALS_DIRECTORY=. ./vpn-daemon

Now you can test accessing the daemon through TLS:

$ curl \
    --cacert ca.crt \
    --cert client.crt \
    --key client.key \
    --connect-to server:41194:localhost https://server:41194/i/node 

If your CA has already been installed in your OS, you do not need to specify the --cacert option to curl.

#Design

The daemon is written in Go. It interfaces with the OpenVPN server process(es) through file system socket(s) in /run/openvpn-server. This socket is used to retrieve the list of connected clients as well as disconnect clients based on "Common Name".

For WireGuard the daemon integrates the wgctrl library.

The daemon is secured with systemd, which makes it possible to run it without root permissions. See the vpn-daemon.service file below.

Whenever a VPN peer is added, either through the portal or through the API, the peer is immediately configured through the daemon(s) as well, so there is no waiting time before a peer can connect. Removal works exactly the same way.

Our approach to surviving reboots/crashes is that the portal that uses the daemon periodically, e.g. every 5 minutes, synchronizes the list of known WireGuard peers.

The VPN portal is the source of truth. It has a list of all WireGuard peers that should be configured through the daemon(s). The portal retrieves all peers known by the daemon and compares that list with the list it has in the database. Based on this it adds peers to, and removes peers from, the daemon. This way, within 5 minutes after a reboot of the node(s), the node is back in the correct state.
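The comparison step boils down to two set differences. The Go sketch below illustrates the idea and is not the portal's actual code. The function name reconcile is hypothetical, and comparing peers by public key only (ignoring changed ip_net values) is a simplifying assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the public keys the portal wants configured with the
// public keys the daemon reports, and returns the keys to add and to remove.
// Both results are sorted to make the outcome deterministic.
func reconcile(wanted, configured []string) (toAdd, toRemove []string) {
	wantedSet := make(map[string]bool, len(wanted))
	for _, key := range wanted {
		wantedSet[key] = true
	}
	configuredSet := make(map[string]bool, len(configured))
	for _, key := range configured {
		configuredSet[key] = true
	}
	for key := range wantedSet {
		if !configuredSet[key] {
			toAdd = append(toAdd, key)
		}
	}
	for key := range configuredSet {
		if !wantedSet[key] {
			toRemove = append(toRemove, key)
		}
	}
	sort.Strings(toAdd)
	sort.Strings(toRemove)
	return toAdd, toRemove
}

func main() {
	// Placeholder keys; real entries are WireGuard public keys.
	wanted := []string{"peer-a", "peer-b", "peer-c"}

	// After a reboot the daemon knows no peers, so everything is added.
	toAdd, toRemove := reconcile(wanted, nil)
	fmt.Println(toAdd, toRemove) // prints [peer-a peer-b peer-c] []

	// During normal operation only the differences remain.
	toAdd, toRemove = reconcile(wanted, []string{"peer-b", "peer-d"})
	fmt.Println(toAdd, toRemove) // prints [peer-a peer-c] [peer-d]
}
```

The configured list would presumably come from /w/peer_list with show_all=yes, as peers that never performed a handshake would otherwise look like they are missing.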

We do not know how well this scales with thousands of VPN peers over multiple nodes. We'll find out, and once it no longer scales we can optimize "Just in Time" and push portal/daemon updates as needed.

#systemd

This is /etc/systemd/system/vpn-daemon.service:

[Unit]
Description=OpenVPN & WireGuard Management Daemon

[Service]
AmbientCapabilities=CAP_NET_ADMIN
Environment=WG_DEVICE=wg0
Environment=LISTEN=127.0.0.1:41194
ExecStart=/usr/sbin/vpn-daemon -wg-device ${WG_DEVICE} -listen ${LISTEN}
Restart=on-failure
PrivateDevices=yes
DynamicUser=yes
SupplementaryGroups=nogroup

[Install]
WantedBy=multi-user.target

The SupplementaryGroups setting contains the group that is used by the OpenVPN processes. On Fedora this would be openvpn.

On Debian, the permissions of the /run/openvpn-server folder are not correct if we want vpn-daemon to get access to it:

drwx--x---  2 root        root         240 Aug 24 22:05 openvpn-server

We'll have to modify the permissions by creating /etc/tmpfiles.d/openvpn.conf, which overrides /usr/lib/tmpfiles.d/openvpn.conf. Only the second line is changed, so that processes that are a member of the group nogroup can access the socket:

d /run/openvpn-client 0710 root root -
d /run/openvpn-server 0750 root nogroup -
d	/run/openvpn	0755	root	root	-	-

Rebooting the system was required to fix the permissions.

#TLS

In order to enable TLS, you can put the following in the file /etc/systemd/system/vpn-daemon.service.d/credentials.conf:

[Service]
LoadCredential=ca.crt:/etc/ssl/vpn-daemon/ca.crt
LoadCredential=server.crt:/etc/ssl/vpn-daemon/server.crt
LoadCredential=server.key:/etc/ssl/vpn-daemon/private/server.key

This uses the Credentials feature of systemd. Make sure the certificates and key are put in the right place, and run the following commands:

$ sudo systemctl daemon-reload
$ sudo systemctl restart vpn-daemon

#Without systemd "Credentials"

If systemd Credentials are NOT available in your OS, for example because of an SELinux issue, CREDENTIALS_DIRECTORY can also be set as an environment variable in /etc/systemd/system/vpn-daemon.service under [Service], e.g.:

Environment=CREDENTIALS_DIRECTORY=/etc/vpn-daemon
User=vpn-daemon
Group=vpn-daemon
DynamicUser=no

Make sure you disable DynamicUser, as you'll need to create a system user vpn-daemon that has permission to read the certificates and key in /etc/vpn-daemon.

Run the following commands:

$ sudo systemctl daemon-reload
$ sudo systemctl restart vpn-daemon