~ky3ga39x/blog

0db4f1290b449fccadd410c4afea93509a8eafb2 — DS 5 months ago dccd070
New post
1 files changed, 78 insertions(+), 0 deletions(-)

A blog/content/posts/zfs-syncoid-slow.md
A blog/content/posts/zfs-syncoid-slow.md => blog/content/posts/zfs-syncoid-slow.md +78 -0
@@ 0,0 1,78 @@
---
title: "Setting up a ZFS backup with Syncoid, discovering why it's slow"
date: 2020-05-20T22:19:19-07:00
draft: false
---

For many years I've been running a NAS with ZFS to store my files.
Some time ago, I decided to stop being lazy and set up a backup ZFS system,
because [RAID is not a backup](https://serverfault.com/questions/2888/why-is-raid-not-a-backup).
For now, the backup machine is sitting a few feet away from the ZFS NAS,
which doesn't really protect me against some catastrophic scenarios,
but at least it's a start.
At some point I'll figure out a way to have an off-site backup solution
that I'm comfortable with (privacy, price, and ease of use are the things I care about).

Well, enough introduction, I just want to tell two short stories.
The first one: how to get [syncoid](https://github.com/jimsalterjrs/sanoid#syncoid) running in a way that's acceptable for me.
Syncoid is pretty great to sync ZFS datasets and snapshots between machines.
While looking for a solution to that, I didn't find anything else that was as easy to use and handled as many edge cases as `syncoid` does,
so I decided to use it.
The catch is that if you try to just sync a ZFS dataset between two machines,
something like `syncoid pool/dataset user@remote:pool/dataset`,
you'll eventually see `syncoid` throwing a `sudo` error: "sudo: no tty present and no askpass program specified".
That's because it's trying to run a `sudo` command on the remote,
and `sudo` has no way to ask for a password, because syncoid runs its commands on the remote over SSH without a terminal.

Searching online, I found many people just saying to enable SSH as root,
which might be fine on a local network, but I don't really like this.
Instead, I'm more comfortable just enabling passwordless `sudo` for `zfs` commands on my user.
Getting this done was very simple:

```
sudo visudo -f /etc/sudoers.d/zfs_receive_for_syncoid
```

And then fill it with the following:

```
<your user> ALL=NOPASSWD: /usr/sbin/zfs *
```

If you really want to put in the effort, you can even take a look at which `zfs` commands `syncoid` is actually invoking,
and then restrict passwordless `sudo` to only those commands.
It's important that you do this for **all** commands that `syncoid` uses.
Syncoid runs a few `zfs` commands with sudo to list snapshots and get some other information on the remote machine before doing the transfer.
I had initially limited passwordless `sudo` only for `zfs receive *`,
and spent quite some time figuring out why `syncoid` was always trying to sync from the first snapshot:
in reality it just wasn't able to list snapshots on the remote machine, so it thought that there were none!
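
A quick way to catch this kind of silent failure is to try the read-only commands yourself with `sudo -n`, which makes `sudo` fail instead of prompting for a password.
Here's a minimal sketch; `check_nopasswd` is just a helper name I made up, and `zfs list -t snapshot` is only one example of a command worth checking, not the full list of what `syncoid` runs:

```shell
# Sketch: report whether a command runs under sudo without a password prompt.
# check_nopasswd is a hypothetical helper name, not part of syncoid or sudo.
check_nopasswd() {
    # -n (non-interactive) makes sudo exit with an error instead of prompting.
    # Note: a failure can also mean the command itself failed for another reason.
    if sudo -n "$@" >/dev/null 2>&1; then
        echo "ok: $*"
    else
        echo "FAILED: $*"
    fi
}

# Run on the remote machine, as the user syncoid logs in with, for example:
#   check_nopasswd zfs list -t snapshot
```

If a command that `syncoid` needs shows up as `FAILED` here, the `sudoers` rule is probably too narrow.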

Well, after all of this fun, I noticed that the transfer speeds were really low, nearing 11MiB/s.
My machines are somewhat old, but not so old that they can't handle gigabit ethernet,
so I decided to investigate.

I ran `iperf -s` on one of the machines, and `iperf -c <remote ip> -d` on the other machine to check if this was a networking problem,
or some other problem (syncoid does some compression and buffering to try to make things faster,
so there could be something going on there).
To my surprise, I got close to 100MiB/s in one direction (from the remote machine to the ZFS NAS),
and about 20MiB/s in the other direction.
Looks network-related.
I ran `ethtool` on both ends to check if there was anything weird going on,
and sure enough, the remote machine reported a speed of 100Mb/s,
while the ZFS NAS reported 1000Mb/s.
To quickly confirm my theory of a bad cable,
I checked my router, which helpfully lights an extra LED when a link is gigabit.
There was only one LED coming from the remote machine, so that was that.
Replaced the cable with a different one, and suddenly I had 6 to 7 times faster transfer speeds. Yay!
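
As a rough sanity check, the 11MiB/s I was seeing during the sync is about what a 100Mb/s link can carry at best.
The sketch below does the unit conversion, ignoring protocol overhead; `mbit_to_mib` is a made-up helper name, and the inputs are the two speeds `ethtool` reported:

```shell
# Convert a link speed in megabits per second (the unit ethtool reports)
# to the theoretical maximum in MiB/s, ignoring protocol overhead.
mbit_to_mib() {
    # megabits -> bits -> bytes -> MiB (1 MiB = 1048576 bytes)
    awk -v mbit="$1" 'BEGIN { printf "%.2f\n", mbit * 1000000 / 8 / 1048576 }'
}

mbit_to_mib 100    # prints 11.92  (the speed the remote machine negotiated)
mbit_to_mib 1000   # prints 119.21 (gigabit)
```

To see what a link actually negotiated, `ethtool <interface> | grep Speed` prints just the speed line, where `<interface>` is whatever your network interface is called.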

That's pretty much it for this post, just wanted to tell those two small stories.
`syncoid` is still syncing the entire dataset to the other machine,
but from what I've seen, looks like I'll be a happy user of this tool.
I've been thinking about investigating [Nix and NixOS](https://nixos.org/learn.html)
and eventually migrating these two ZFS machines (which are currently on Ubuntu) to NixOS,
to make my life easier in the future whenever I need to set things up on another machine.
Nix and NixOS kind of remind me of the [Yocto project](https://www.yoctoproject.org/),
something I worked with many years ago when developing firmware for some devices.
I really enjoyed Yocto; it was likely one of the first open source projects that I
thought was really well-polished.
I might make a post about Nix and NixOS in the future if/when I get to explore it some more.