~droyo/tapalloc

allocate ephemeral tap devices

clone

read-only
https://git.sr.ht/~droyo/tapalloc
read/write
git@git.sr.ht:~droyo/tapalloc

The ontap commands allow non-privileged users to provision network
interfaces, which they can use for VMs, VPN tunnels, docker containers,
or really, whatever they want, in a way that does not compromise on
security or auditability.

Normally, creating a network interface is a privileged operation. So if
you want to run, for example, a virtual machine, and give it its own
networking device, you have to employ a workaround. You can use
user_namespaces(7) and network_namespaces(7) to create network
interfaces inside your own namespace, but connecting those to the outside
world without sacrificing performance still requires elevated
privileges. The usual workarounds are:

* You can implement a full TCP/IP stack within your user process. This is
  what QEMU's `-netdev user` does. The drawback is that performance
  is poor, and your VM will not be visible on the network beyond your
  computer, so it is not well-suited for server use cases.

* You can use `sudo` to run the process as the root user. Even if you do
  have `sudo` access, any vulnerability in the program is now a means
  of privilege escalation. Some programs, like QEMU, provide a way to drop
  privileges after allocating privileged resources. Even setting aside
  the fact that any code that executes *before* dropping privileges is
  security-critical, every application that needs privileged resources
  now also needs to implement a way to drop privileges once it no longer
  needs them.

* You can use a "helper" program, owned by root, with the setuid bit set.
  This program, when executed, will run with superuser privileges. This
  has exactly the same drawbacks as using sudo. In addition, the
  file itself is a potential attack vector. It may sound like I'm
  hand-waving, but it has happened before (CVE-2013-1858). I don't think
  it's controversial to say that our systems would be more secure if we
  used the `nosuid` mount flag for all file systems.

* On Linux, you can grant the CAP_NET_ADMIN capability (see
  capabilities(7)) to the qemu binary, the qemu process, or to the helper
  program. This has similar drawbacks to setuid root, but reduces the
  damage if the program is compromised. With careful planning, you can
  also limit the capabilities a process tree may inherit from a file,
  so long as they do not inherit CAP_SETPCAP or a superset of it.
  However, capabilities are still coarse; CAP_NET_ADMIN enables much more
  than just allocating network interfaces, and the capability will be
  available for the lifetime of the process. Still, this is much better
  than the previous options; a trusted file or user with this capability
  can allocate the interface and then execve(2) into a new program,
  which (if configured properly) will drop the capability.
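For reference, the privileged operation that all of these workarounds are
trying to reach can be sketched in a few lines. This is a minimal Python
illustration and not code from ontap: the helper names (`ifreq_for_tap`,
`has_cap_net_admin`, `open_tap`) are hypothetical, the constants are
copied from the Linux headers, and the 40-byte `struct ifreq` layout
assumes 64-bit Linux.

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h> and <linux/capability.h>.
TUNSETIFF = 0x400454CA
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000
CAP_NET_ADMIN = 12


def ifreq_for_tap(name: str) -> bytes:
    """Pack a struct ifreq asking for a tap device called `name`."""
    raw = name.encode("ascii")
    if len(raw) > 15:  # IFNAMSIZ is 16, including the trailing NUL
        raise ValueError("interface name too long")
    # 16-byte name, 2-byte flags, then padding up to 40 bytes
    # (sizeof(struct ifreq) on 64-bit Linux; an assumption of this sketch).
    return struct.pack("16sH22x", raw, IFF_TAP | IFF_NO_PI)


def has_cap_net_admin(status_text: str) -> bool:
    """Check the CapEff line of /proc/<pid>/status for CAP_NET_ADMIN."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return bool(int(line.split()[1], 16) & (1 << CAP_NET_ADMIN))
    return False


def open_tap(name: str = "tap%d") -> tuple[int, str]:
    """Create a tap device; raises OSError if the caller lacks privilege."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        reply = fcntl.ioctl(fd, TUNSETIFF, ifreq_for_tap(name))
    except OSError:
        os.close(fd)
        raise
    # The kernel writes the name it actually assigned back into the request.
    return fd, reply[:16].rstrip(b"\0").decode("ascii")
```

Calling `open_tap()` as an ordinary user is expected to fail with a
permission error, which is exactly the problem the options above try to
work around. `has_cap_net_admin(open("/proc/self/status").read())`
reports whether the current process holds the capability.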

Another option, which the ontap suite implements, is to run a local
service, reached through a unix socket, which allocates a new tap device
on behalf of the client. The service then uses a peculiar ability of unix
sockets, SCM_RIGHTS ancillary data (see unix(7)), to send the descriptor
for the tap device over the socket to the client. The unix socket's
permissions can be used to control access to the service. In addition,
getpeereid(3) (or, on Linux, the SO_PEERCRED socket option) lets the
service implement arbitrarily complex access logic based on the uid and
gid of the requesting user.
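The descriptor-passing step can be sketched as follows. This is a
hedged Python illustration (Python 3.9 or later), not the ontap
implementation: `send_fd`, `recv_fd` and `demo` are hypothetical names, a
pipe stands in for the tap descriptor so the example runs without
privileges, and a `socketpair` stands in for the listening unix socket.

```python
import os
import socket


def send_fd(sock: socket.socket, fd: int, label: bytes) -> None:
    """Send `fd` as SCM_RIGHTS ancillary data, with `label` as payload."""
    socket.send_fds(sock, [label], [fd])


def recv_fd(sock: socket.socket) -> tuple[bytes, int]:
    """Receive one descriptor and the payload that came with it."""
    data, fds, _flags, _addr = socket.recv_fds(sock, 64, 1)
    if len(fds) != 1:
        raise RuntimeError("expected exactly one descriptor")
    return data, fds[0]


def demo() -> bytes:
    service, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()  # stand-in for the tap descriptor
    try:
        send_fd(service, write_end, b"tap0")
        os.close(write_end)  # the service no longer needs its copy
        label, received = recv_fd(client)
        # The client now owns a working duplicate of the descriptor.
        os.write(received, b"hello via " + label)
        os.close(received)
        return os.read(read_end, 64)
    finally:
        os.close(read_end)
        service.close()
        client.close()
```

The received descriptor is a new entry in the client's descriptor table
that refers to the same open file, so it stays usable after the service
closes its own copy.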

This approach has the same flexibility as a setuid binary, without the
security drawbacks. See https://skarnet.org/software/s6/localservice.html
for more details on this "local service" approach.
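The access-control side of a local service can be sketched in the same
spirit. This is an assumption-laden Python illustration for Linux, where
the SO_PEERCRED socket option plays the role of getpeereid(3); the
function names and the allow-list policy are hypothetical and are not
taken from ontap.

```python
import os
import socket
import struct


def peer_credentials(sock: socket.socket) -> tuple[int, int, int]:
    """Return (pid, uid, gid) of the peer, as recorded by the kernel."""
    fmt = "iII"  # struct ucred: pid_t pid; uid_t uid; gid_t gid
    raw = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(fmt)
    )
    pid, uid, gid = struct.unpack(fmt, raw)
    return pid, uid, gid


def is_allowed(uid: int, gid: int,
               allowed_uids: set[int], allowed_gids: set[int]) -> bool:
    """Hypothetical policy: admit listed users or listed primary groups."""
    return uid in allowed_uids or gid in allowed_gids
```

A service would call `peer_credentials` on each accepted connection and
refuse the request when `is_allowed` returns false. Note that the kernel
reports only the peer's effective uid and gid at connection time;
supplementary groups would have to be looked up separately.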

Some of these utilities are meant to be run under a super-server like
`s6-ipcserver` or `ucspi-unix`, which connects them to a unix domain
socket. They could also be connected to a child or parent process via
socketpair(2). See the individual tools' man pages for more details.


# BUILD

This project was built with OCaml, version 5.0. It does not use any of
the new concurrency features, so older versions (4.08 and above) will
probably work, but I have not tested them. To build, run

	dune build

The file `guix.scm` defines a Guix[1] package for ontap. If you have Guix
installed, it can be built with

	guix build -f guix.scm

which has the benefit of pulling in the external dependencies, and, provided
you pin your version of the guix repository, allows for reproducible
builds. Alternatively, you can run

	guix shell -D

which will spawn a new shell with dependencies loaded, from which you can
develop the program with a consistent set of dependencies matching the
final package.

[1]: https://guix.gnu.org/