~fluix/fluix.one

aa1fb92ad5aaaccddfdf8a7203ea1efeafcbe457 — Steven Guikal 1 year, 2 months ago e9706f8
Add UW CSC mirror hack posts
2 files changed, 401 insertions(+), 0 deletions(-)

A content/blog/uwaterloo-mirror-hack-1.md
A content/blog/uwaterloo-mirror-hack-2.md
A content/blog/uwaterloo-mirror-hack-1.md => content/blog/uwaterloo-mirror-hack-1.md +325 -0
@@ 0,0 1,325 @@
---
title: "Thoughts on the csclub.uwaterloo.ca mirror hack"
date: 2023-07-09T20:49:00-04:00
draft: false
---

On April 21st, 2023, at 21:05:55, an IP address in Marrakesh, Morocco maliciously uploads the alexusMailer anonymous mailing PHP script to the University of Waterloo's Computer Science Club (CSC) open source software mirror. A week later, on the chilly night of April 29th, 2023, at 11:44:00, past and present sysadmins of the University of Waterloo's Computer Science Club notice the unusual file and collectively agree that the mirror's server has been breached.

Staying up into the early hours of the following morning, they scour through logs, identify and patch the root cause, implement additional safeguards, and inform all relevant parties.

{{< u raymo "Raymond Li" "https://raymond.li/">}}, a member of the Systems Committee (syscom) of the CSC for several years and a present sysadmin, recounts this incident from his perspective in two articles, ["FTPWND PART 1: SILENT BUT DEADLY"](https://mathnews.uwaterloo.ca/wp-content/uploads/2023/06/mathNEWS-152-2.pdf#page=8&zoom=auto,-188,778) and ["FTPWND PART 2: CLOSE SHAVE"](https://mathnews.uwaterloo.ca/wp-content/uploads/2023/06/mathNEWS-152-3.pdf#page=16), published in the 152nd volume of the mathNEWS paper. A little birdie recently brought these to my attention and, figuring I have relevant knowledge and similar experiences, I want to comment on his writing. He also helpfully left some questions and exercises for the reader to explore, which I'll go through as well.

*If you have little interest in a summary of the articles, [skip to part 2](/blog/uwaterloo-mirror-hack-2).*

## FTPWND PART 1: SILENT BUT DEADLY

{{< u raymo >}} begins by describing his initial exposure to the incident, a few hours after the dust had already settled. He wakes up on April 30th, after returning from vacation, to pings from IRC discussing the breach.

> We read about breaches and leaks and “pwn”s all the time, but they always
> seem to happen to others, and we think we’re foolproof. This is the story of
> how a system I was responsible for was breached.

I think most people exploring system administration come in with a feeling that they won't mess it up. It's _easy_ to follow instructions, read documentation, and not make _silly_ mistakes, right? With time, you gain some wisdom and an understanding that security is hard, with many issues hidden away in the unexplored interactions between components. Hopefully you understand this before it's too late, before you're responsible for users' personal data or some important systems.

My own wake-up call happened when my high school's computer club (MCPT) judging server was pwned because we had left root SSH access open without a password. Of course it was a _little_ bit more complicated than that &mdash; you can read the [incident report here](https://github.com/mcpt/wlmoj-incidents/blob/cstate/reports/2021-10-28-judge-breach.md).

### ACT I: Background

This section primarily discusses {{< u raymo >}}'s role in the CSC, the purpose of a software mirror, a brief description of FTP, and the truly impressive size of the CSC's mirror.

> If you look at the list of Arch Linux (another popular distribution) mirrors,
> you’ll see the CSC mirror listed as the only Tier-1 mirror (that is, mirrors
> that sync directly from the original source, and that other mirrors sync
> from) in Canada.[^2] If our mirror were to go offline, that would result in a
> chain reaction, possibly causing other mirrors to go out of date as well.
>
> [^2]: https://archlinux.org/mirrors/tier/1/

I wasn't sure if tier 2 mirrors sync from only a single mirror, but it appears that is indeed the case from the wording on the [DeveloperWiki:NewMirrors page](https://wiki.archlinux.org/title/DeveloperWiki:NewMirrors#Create_a_feature-request). I would not be surprised if all twelve tier 2 Canadian mirrors[^a] sync from the tier 1 CSC mirror, so all of them would go out of sync if it went offline. This would prove problematic, but not catastrophic, as users can switch to other mirrors.

[^a]: https://archlinux.org/mirrors/tier/2/

> FTP servers can be configured to allow “anonymous” logins, which allow the
> downloading and uploading of files without logging in.[^4] This is commonly
> used by mirrors like the CSC’s, where anyone should be allowed to download
> mirrored files without logging in. As the title of this post suggests, FTP
> was the failing link that enabled the breach.
>
> [^4]: https://www.investopedia.com/terms/f/ftp-file-transfer-protocol.asp

This is the first hint of what was involved in the server's breach: the FTP server. I think it should also be noted that FTP hasn't been recommended for mirroring by some distributions for over a decade. Arch Linux, for instance, had discussions to remove ftp mirrors from default mirrorlists in 2012,[^c] denied "accepting new ftp mirrors" since 2015,[^b] and now has only 1 of 468 mirrors, [librelabucm.org](https://archlinux.org/mirrors/librelabucm.org/) in Spain, advertising the protocol. HTTP and HTTPS are the de facto standards.

[^b]: https://wiki.archlinux.org/index.php?title=DeveloperWiki:NewMirrors&diff=357525&oldid=357519
[^c]: https://lists.archlinux.org/archives/list/arch-dev-public@lists.archlinux.org/thread/QKZVK63V24T2VHSPKW2RICLCRNADBSDB/#BDUSRZG7JHZOCXVLG2BAC44TL6SAGNNG

### Act II: Configuration

> [...] mirrored sources are fetched regularly from upstream by
> “potassium-benzoate”, the machine that serves as our mirror, using a golang
> script called “merlin”, developed in-house by syscom. Merlin fetches to the
> /mirror/ root directory, and is run by the “mirror” user, which has no
> password, since potassium-benzoate is only accessible by syscom users, all of
> whom are trusted.

I think not having a password on a user with minimal privileges is pretty common and acceptable for the reasons mentioned, particularly when a machine doesn't have other unprivileged users which could use the account for escalation.[^d] It certainly would not hurt to add a password to the user from a security standpoint, but it is unlikely anybody is logging in to this user directly anyways. Instead, a privileged user will change their uid/gid to mirror's through a command like `su` without being prompted for a password.

If this is true, it would be better for the user to have an _unmatchable_ password, meaning no value entered would ever match with the hashed version of the password stored in `/etc/shadow`. This is effectively the same as generating a strong password and promptly forgetting it. The value used by `passwd --lock` is `!`.

Alternatively, if the mirror user is only meant to be used as one under which other daemons and scripts are run, it could have a nonexistent login shell (or the more user-friendly `nologin` replacement shell). Privileged users can still run commands under it by using, e.g. `doas -u mirror`.
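
To make the distinction concrete, here is a minimal sketch that classifies the password field (the second colon-separated field) of an `/etc/shadow`-style entry. It assumes the conventions described in `shadow(5)`: an empty field means no password is set, and a field beginning with `!` or `*` can never match an entered password. The function name and the sample entries are hypothetical, made up for illustration.

```python
def classify_shadow_password(entry: str) -> str:
    """Classify the password field of one /etc/shadow-style entry.

    Returns "empty" (no password set), "locked" (field starts with '!' or
    '*', so no entered password can match), or "hashed".
    """
    fields = entry.rstrip("\n").split(":")
    if len(fields) < 2:
        raise ValueError(f"not a shadow entry: {entry!r}")
    password = fields[1]
    if password == "":
        return "empty"
    if password[0] in "!*":
        return "locked"
    return "hashed"


# Hypothetical entries, made up for illustration.
for entry in (
    "mirror::19473:0:99999:7:::",
    "mirror:!:19473:0:99999:7:::",
    "alice:$6$examplesalt$examplehash:19473:0:99999:7:::",
):
    print(entry.split(":")[0], classify_shadow_password(entry))
```

An account in the "empty" state is the one to worry about whenever some network-facing daemon defers to Unix authentication.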

[^d]: I do not know if this is the case with `potassium-benzoate`.

On the night of April 29th, the [ProFTPD](http://www.proftpd.org/) FTP daemon configuration contained the snippet below:

```apache
<Anonymous /mirror/root>
  # Limit WRITE everywhere in the anonymous chroot
  <Directory *>
    <Limit WRITE>
      DenyAll
    </Limit>
  </Directory>
</Anonymous>
```

{{< u raymo >}} describes the functionality of this snippet as such:

> The config file essentially creates an anonymous chroot at /mirror/root that
> allows for anyone to connect via FTP, without authentication, and download
> but not upload files.

My first thought when reading this was that the wording of <q>anonymous chroot</q> is strange. `chroot` is a Linux syscall that changes the root directory of the calling process and its children, often poorly used for sandboxing.[^e] The meaning is clarified by the [`<Anonymous>` directive docs](http://www.proftpd.org/docs/modules/mod_core.html#Anonymous):

>  **Syntax:** `<Anonymous anon-directory>`
>
> The `<Anonymous>` configuration section is used to create an anonymous FTP
> login, and is closed by a matching `</Anonymous>` directive. The
> `anon-directory` parameter specifies the directory to which the daemon,
> immediately after successful authentication, will restrict the session via
> chroot(2). 


[^e]: `chroot(2)` explicitly states "it is not intended to be used for any kind of security purpose, neither to fully sandbox a process nor to restrict filesystem system calls."

The first article ends with two questions to the reader:

> What do you think is wrong with this configuration? How would you exploit it
> if you were an attacker?

While reading, I recall noticing that `<Directory *>` is nested _inside_ the `<Anonymous>` directive, so the `<Limit>` directive would apply only to paths of the form `/mirror/root/*` and not to those under other directories, e.g. `/mirror/foo`. Perhaps there was a path traversal vulnerability which let attackers upload files outside of `/mirror/root/`? We'll find out soon.

Thinking about it after the fact, I might consider that denying `WRITE` may not deny other non-read commands like modifying file permissions or deleting uploads. It turns out this is not the case; `WRITE` is actually a command _group_ covering all commands which modify file data.[^f]

[^f]: http://www.proftpd.org/docs/howto/Limit.html

## FTPWND PART 2: CLOSE SHAVE

### Act III: Pwned

{{< u zseguin "Zachary Seguin" "https://zacharyseguin.ca/" >}}, {{< u merenber "Max Erenberg" "https://maxerenberg.github.io/" >}}, and {{< u szclsya "Leo Shen" "https://szclsya.me/" >}}, the syscom members online at the time, consider different methods of compromise, look through logs, and find [`pam_unix`](https://linux.die.net/man/8/pam_unix) and ProFTPD logs matching the malicious files and their timestamps. Of course, logs on a breached machine can't always be trusted, but they're okay to look at, especially when working under the assumption that only minimal compromise of specific accounts was achieved.

FTP transfer logs are presented in the article,[^logs] with attention being brought to the different directory and flags associated with the usual vs. malicious traffic.

[^logs]: I've removed the first few components of the logs, aligned the flags after the filename, shortened paths, and fixed some missing underscore characters corresponding to the `special-action-flag`.

```txt
/mirror/root/.../lirc-0.10.1.tar.bz2         b _ o a -wget@ ftp 0 * c
/mirror/root/.../lirc_0.10.1-7.debian.tar.xz b _ o a -wget@ ftp 0 * c
/mirror/root/.../libcue-2.2.1.tar.gz         b _ o a -wget@ ftp 0 * c
/home/mirror/ARS.php                         a _ i r mirror ftp 0 * c
/home/mirror/alexusMailer_v2.0.php           a _ i r mirror ftp 0 * c
/mirror/root/.../fluidsynth-2.3.1.tar.gz     b _ o a -wget@ ftp 0 * c
```

These logs show that the files were uploaded by an authenticated FTP session as the mirror user. The single character flags that differ between the normal (`o a`) and malicious (`i r mirror`) transfers have the following meanings, from [xferlog(5)](https://linux.die.net/man/5/xferlog):

 - `i`: incoming transfer
 - `r mirror`: local authenticated user, `mirror`
 - `o`: outgoing transfer
 - `a`: anonymous guest user
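
Since the original article later laments the lack of an automated log check, here is a minimal sketch of one, assuming the shortened line layout shown above (filename first, then nine whitespace-separated fields). The names `Transfer`, `parse_shortened_xferlog`, and `is_upload` are hypothetical. Only the flag values explained in this post are decoded; anything else is passed through unchanged.

```python
from typing import NamedTuple

# Only the values discussed in this post; other values pass through as-is.
DIRECTIONS = {"i": "incoming", "o": "outgoing"}
ACCESS_MODES = {"a": "anonymous", "r": "real"}


class Transfer(NamedTuple):
    filename: str
    direction: str
    access_mode: str
    username: str


def parse_shortened_xferlog(line: str) -> Transfer:
    """Parse a line in the shortened layout above (filename first)."""
    # Split from the right so a filename containing spaces stays intact:
    # nine single-token fields follow the filename.
    parts = line.rsplit(None, 9)
    if len(parts) != 10:
        raise ValueError(f"unexpected xferlog line: {line!r}")
    filename, _transfer_type, _action_flag, direction, mode, username = parts[:6]
    return Transfer(
        filename.strip(),
        DIRECTIONS.get(direction, direction),
        ACCESS_MODES.get(mode, mode),
        username,
    )


def is_upload(transfer: Transfer) -> bool:
    """On a download-only mirror, any incoming transfer deserves a look."""
    return transfer.direction == "incoming"


# Two lines taken from the excerpt above.
lines = [
    "/mirror/root/.../libcue-2.2.1.tar.gz         b _ o a -wget@ ftp 0 * c",
    "/home/mirror/ARS.php                         a _ i r mirror ftp 0 * c",
]
for line in lines:
    transfer = parse_shortened_xferlog(line)
    if is_upload(transfer):
        print("upload:", transfer.filename, "by", transfer.username)
```

Run against the two sample lines, only the `ARS.php` upload is reported. A real log line carries a timestamp, transfer time, remote host, and file size before the filename, so a production version would need to strip those first.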

### Act IV: Damage Control

The root cause of the breach is found:

> Upon investigating, zseguin realises that there is a compromising default set
> by ProFTPD. It allows writes and logins by all users, by default.[^7]
>
> [^7]: Determined through experimentation, proof left as an exercise for the
>     reader

A user simply logged in to the mirror user and uploaded the malware to `/home/mirror/` because nothing prevented them.

The footnote states that this default was "determined through experimentation" leaving the proof as "an exercise for the reader." So my ~~hand~~ mind is forced; I must investigate.

#### A Footnote

Cloning the [ProFTPD source code](https://github.com/proftpd/proftpd/), I head for the `mod_auth` module which is responsible for the login process.[^g] Its source is available in [`modules/mod_auth.c`](https://github.com/proftpd/proftpd/blob/99c82a5745e1622ecb17bf9fa153778e5f0fd3a5/modules/mod_auth.c) and, after various smaller functions, contains **<q>the biggie</q>**:

[^g]: http://www.proftpd.org/docs/modules/

```c
/* Next function (the biggie) handles all authentication, setting
 * up chroot() jail, etc.
 */
static int setup_env(pool *p, cmd_rec *cmd, const char *user, char *pass)
```

This function clocks in at just over 1,000 lines, but is mostly linear and easy to read. The comments and logging statements also improve readability, but I'm only trying to answer the question of where the default lies, so I'll liberally skip unrelated code.

The first parts of this function handle the Anonymous, UserAlias, and RootLogin directives, and user, password, and group lookup errors, so let's skip to [line 1139](https://github.com/proftpd/proftpd/blob/99c82a5745e1622ecb17bf9fa153778e5f0fd3a5/modules/mod_auth.c#L1139) of the file:

```c
/* If c != NULL from this point on, we have an anonymous login */
aclp = login_check_limits(main_server->conf, FALSE, TRUE, &i);
```

This call returns FALSE (`0`) if any `<Limit LOGIN>` directive matches, preventing the user from logging in, and TRUE (`1`) otherwise. For a default configuration, this returns 1 since no `<Limit LOGIN>` directives are present. Following the comment, I'll exclude all blocks for anonymous logins. Proceeding on [line 1187](https://github.com/proftpd/proftpd/blob/99c82a5745e1622ecb17bf9fa153778e5f0fd3a5/modules/mod_auth.c#L1187):

```c
if (c == NULL &&
    aclp == 0) {
  pr_log_auth(PR_LOG_NOTICE,
    "USER %s (Login failed): Limit access denies login", origuser);
  goto auth_failure;
}

if (c == NULL ||
    (anon_require_passwd != NULL &&
     *anon_require_passwd == TRUE)) {
    // ...
```

This checks the previously set `aclp`, then opens a block of further checks, which we enter because `c == NULL`. On [line 1224](https://github.com/proftpd/proftpd/blob/99c82a5745e1622ecb17bf9fa153778e5f0fd3a5/modules/mod_auth.c#L1224):

```c
/* It is possible for the user to have already been authenticated during
 * the handling of the USER command, as by an RFC2228 mechanism.  If
 * that had happened, we won't need to call do_auth() here.
 */
if (!authenticated_without_pass) {
  auth_code = do_auth(p, c ? c->subset : main_server->conf, user_name,
    pass);

} else {
  auth_code = PR_AUTH_OK_NO_PASS;
}
```

By default, the login requires a password so `do_auth` is called which, based on the configuration, will attempt authentication through any number of authentication modules. By default, this will include the `mod_auth_unix` module which will authenticate through multiple Unix login mechanisms like `/etc/shadow`. There's no reason this fails by default so at this point we're already essentially logged in, but let's continue on to [line 1247](https://github.com/proftpd/proftpd/blob/99c82a5745e1622ecb17bf9fa153778e5f0fd3a5/modules/mod_auth.c#L1247):

```c
switch (auth_code) {
  case PR_AUTH_OK_NO_PASS:
    auth_pass_resp_code = R_232;
    break;

  case PR_AUTH_OK:
    auth_pass_resp_code = R_230;
    break;

  case PR_AUTH_NOPWD:
    pr_log_auth(PR_LOG_NOTICE,
      "USER %s (Login failed): No such user found", user);
    goto auth_failure;

  // ...many more auth_failure cases omitted.

  default:
    break;
};

/* Catch the case where we forgot to handle a bad auth code above. */
if (auth_code < 0) {
  goto auth_failure;
}
```

Here we check for `auth_code` which will be `PR_AUTH_OK` and continue on to two important checks starting at [line 1332](https://github.com/proftpd/proftpd/blob/99c82a5745e1622ecb17bf9fa153778e5f0fd3a5/modules/mod_auth.c#L1332):

```c
res = pr_auth_is_valid_shell(c ? c->subset : main_server->conf,
  pw->pw_shell);
if (res == FALSE) {
  pr_log_auth(PR_LOG_NOTICE, "USER %s (Login failed): Invalid shell: '%s'",
    user, pw->pw_shell);
  goto auth_failure;
}

res = pr_auth_banned_by_ftpusers(c ? c->subset : main_server->conf,
  pw->pw_name);
if (res == TRUE) {
  pr_log_auth(PR_LOG_NOTICE, "USER %s (Login failed): User in "
    PR_FTPUSERS_PATH, user);
  goto auth_failure;
}
```

The first check determines whether we have a valid shell or not. It takes into account the value of the [`RequireValidShell` directive](http://proftpd.org/docs/modules/mod_auth.html#RequireValidShell) which defaults to `on`, so users are required to have a valid shell, and most users on a system do.

The second check is based on a special legacy file which sits at `/etc/ftpusers`. As the [`UseFtpUsers` directive](http://proftpd.org/docs/modules/mod_auth.html#UseFtpUsers) explains, this is a list of users which are _not_ allowed to log in to the FTP daemon. By default, this file is of course blank and so the check passes.

Then there's handling of some more directives and a few more checks depending on configuration but none that really affect the default. We reach the end:

```c
/* Authentication complete, user logged in, now kill the login
 * timer.
 */
```

So the <q>compromising default</q> is real, mainly because ProFTPD relies on standard Unix authentication mechanisms and is _designed_ to allow all users who could log in to the server normally to also do so over FTP and upload files to the server.
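
The path traced through `setup_env` can be condensed into a toy model. This is only a sketch of the four checks discussed above, in the order they were encountered; it is not ProFTPD's API, and every name in it (`LoginAttempt`, `login_outcome`, the field names) is hypothetical. The default field values are an assumption standing in for an unconfigured server and an ordinary Unix account.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    """Inputs to the toy model; defaults stand in for an unconfigured server."""
    denied_by_limit_login: bool = False  # no <Limit LOGIN> section present
    auth_ok: bool = True                 # Unix authentication accepted the login
    valid_shell: bool = True             # RequireValidShell check passes
    in_ftpusers: bool = False            # user not listed in /etc/ftpusers


def login_outcome(attempt: LoginAttempt) -> str:
    """Apply the four checks in the order traced above."""
    if attempt.denied_by_limit_login:
        return "denied: Limit access denies login"
    if not attempt.auth_ok:
        return "denied: authentication failed"
    if not attempt.valid_shell:
        return "denied: invalid shell"
    if attempt.in_ftpusers:
        return "denied: user in ftpusers"
    return "logged in"


print(login_outcome(LoginAttempt()))
print(login_outcome(LoginAttempt(denied_by_limit_login=True)))
```

Flipping any one of the four fields is enough to turn the outcome into a denial, which is why several independent mitigations were available.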

### Act IV: Damage Control (continued)

I would like to point out that, while the default in some sense "caused" the breach, I don't think it's an entirely bad default. Consider SSH for instance: all SSH servers that I know of default to allowing all logins, subject to similar checks as ProFTPD. You might say that this is an exception because the entire purpose of SSH is to _remotely access_ the server, but then what about sftp (OpenSSH secure file transfer)?

ProFTPD's documentation could use some work when it comes to default scenarios, and it should mention more of the assumptions it makes because of the older FTP servers it is based on. However, one should not be surprised that any daemon which relies on an external authentication mechanism *trusts* that authentication mechanism. I also find the rest of its documentation regarding directives and modules to be very comprehensive; and, as mentioned, its code is quite readable, at least the parts I've read.

To fix the breach, the diff below gets applied:

```diff
 <Anonymous /mirror/root>
   # Limit WRITE everywhere in the anonymous chroot
   <Directory *>
     <Limit WRITE>
       DenyAll
     </Limit>
   </Directory>
+  <Limit LOGIN>
+    AllowAll
+  </Limit>
 </Anonymous>
+<Limit LOGIN>
+  DenyAll
+</Limit>
```

This globally denies `LOGIN` commands except in the anonymous mirror directory. I think this is a simple, robust fix &mdash; good properties when it comes to security changes.

> Private SSH keys stored in the “mirror” user’s home directory (used to update
> from secured upstreams) are rotated. Emails are sent to relevant upstream
> projects to notify them of the breach, and to invalidate the compromised SSH
> keys and rsync passwords. The University’s Information Systems and Technology
> Information Security Services Security Operations Center (IST SOC) is
> notified of the breach, since it occurred on the campus network.

Once the breach is contained, I think this is the right step to take in incident response. While you could wait to implement more security measures and do more investigation, informing everyone affected as soon as possible makes sure they can follow their own incident response plans more effectively. Ultimately, even if unaffected, I think informing users is also important to let them make their own decisions about trusting your systems and services.

When the MCPT judge was breached, our supervisor was informed the next morning and I published an incident report [publicly](https://github.com/mcpt/wlmoj-incidents/commit/81bdd70d29d56b73df277e9f689b7a01919377c4) before heading to bed the same night.

> Finally, five hours after the initial discovery of the breach at midnight,
> the syscom members on duty could get some sleep.

And I hope they slept well, because they deserve it!

Personally, I don't recall if I was finally calm after getting everything resolved or worried about the potential consequences for the club...

### ACT V: LESSONS LEARNED

> When I woke up that morning (my first day back from vacation) and checked
> IRC, my heart nearly stopped. This could have been catastrophic. An attacker
> could have modified the sync scripts to compromise all the files we served
> from our mirror.

And this is where I disagree. [Read on in part 2](/blog/uwaterloo-mirror-hack-2).

A content/blog/uwaterloo-mirror-hack-2.md => content/blog/uwaterloo-mirror-hack-2.md +76 -0
@@ 0,0 1,76 @@
---
title: "A mirror hack should not be catastrophic"
date: 2023-07-09T20:50:00-04:00
draft: false
---

This is a continuation of my [thoughts on the csclub.uwaterloo.ca mirror hack](/blog/uwaterloo-mirror-hack).

> This could have been catastrophic. An attacker could have modified the sync
> scripts to compromise all the files we served from our mirror.

Yes, they _could_ control all files served by the mirror, but this presumes that upstream projects trusted the mirror operators in the _first place_.[^a] While this may be the case for mirroring [their own files](http://mirror.csclub.uwaterloo.ca/csclub/), any third party project which lists the CSC mirror as an official mirror need not, and **should not**, naively trust that they will serve the same files they received.

[^a]: Projects that designate official mirrors trust and verify that operators will keep files up to date and maintain mirror availability.

If projects did, becoming a popular mirror and then swapping out files with malicious ones would be big business. And while it takes some effort to run a mirror, I would bet it's far simpler than many other methods of distributing malware.

Instead, projects *sign* the files (commonly application packages) they serve with private keys whose public halves are known to users, and users verify those signatures. For OS distributions like Arch or Alpine, users get a copy of the public key(s) in their initial download of the ISO from a first-party source, and then their respective package managers ([pacman](https://archlinux.org/pacman/) and [apk](https://gitlab.alpinelinux.org/alpine/apk-tools)) verify that all future downloads are signed by the corresponding private key(s).

Of the 61 projects mirrored, over 75% have signatures attached, so malicious files would be quickly detected and the mirror operators informed.[^b] There would be no catastrophe. This security is the very purpose of package signing.
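
The detection step can be sketched for one common arrangement, assumed here rather than checked for every mirrored project: a signed index obtained from a trusted source lists a SHA-256 digest per file, and the client compares each file served by the mirror against it. Verifying the index's own signature needs a cryptography library or a tool like GnuPG and is not shown. The function name and sample bytes are hypothetical.

```python
import hashlib
import hmac


def matches_trusted_digest(data: bytes, trusted_sha256_hex: str) -> bool:
    """Return True if data hashes to the digest from the trusted index."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, trusted_sha256_hex.lower())


genuine = b"example package contents"
# Stands in for a digest read from the signed, already-verified index.
trusted_digest = hashlib.sha256(genuine).hexdigest()

print(matches_trusted_digest(genuine, trusted_digest))
print(matches_trusted_digest(b"tampered by the mirror", trusted_digest))
```

A mirror operator, or an attacker controlling the mirror, can change the bytes served but cannot produce a matching entry in an index they cannot sign.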

[^b]: I originally had a table showing every mirrored project, whether the mirror was official, and whether there were signatures, but it took too long to check everything. Given that I cannot guarantee (in a reasonable amount of time) that the signatures are actually checked (by default) by package managers, that all files are signed, that the public key is obtained from a trusted source, or whether authenticity is guaranteed some other way, I figure I will leave it at this.

> Several things had gone wrong here. The “mirror” user didn’t have a password.
> The default settings of ProFTPD were too insecure for our use case, and the
> documentation is incredibly hard-to-read.[^8] We didn’t prevent logins via
> FTP. We didn’t prevent unnecessary access to home directories. We didn’t
> check our FTP logs often, or have an automated way to do it. If even one of
> these factors had been mitigated, the breach that we finally detected
> wouldn’t have occurred. Syscom would eventually switch over to vsftpd
> (very-secure FTP daemon, a more secure implementation with much more readable
> documentation) instead, for a simpler configuration to avoid further
> mishaps.[^9]
>
> [^8]: http://proftpd.org/docs/
> [^9]: https://security.appspot.com/vsftpd.html

I think the primary lessons here, of using software you personally understand, and minimizing your attack surface by choosing the most minimal, secure application for the job, are key. After all, as a system administrator &mdash; some of whom are even software *engineers* &mdash; responsible for important infrastructure, is not knowing the software you run a good excuse for getting pwned?

Of course I don't mean you should perform a software audit on everything you use, or know every part of the Linux kernel, or know all applicable RFCs by heart,[^rfc] but you should be confident in your choice of tool for the job. If all you need to do is run a script every hour, use cron, not Ansible; if you can use a new tool or reuse an existing one, prefer the latter. And yes, there are caveats and exceptions and difficulties in all of this, but that's part of what makes this job so interesting &mdash; figuring out what's best.

[^rfc]: Having read them is actually a pretty good idea.

Thankfully, this is something the CSC sysadmins know, so I'm happy to hear security is improving, and I'll continue to use the CSC mirror.

> When asked for comment for this blog post, zseguin emphasized that it’s
> “important to know what’s running, where your points of entry are and where
> your logs are”, [...]. [merenber] also mentioned that syscom “needs to be
> more proactive” and that “the vsftpd configuration is far simpler (only one
> config file)... This makes it far less likely for someone to screw it up.”

### Food for Thought

The article concludes with three questions to the reader.

> 1. Can you identify where ProFTPD documents (or even find the source
>    implementation) that it allows writes and logins from all users by
>    default?

See the [previous post](/blog/uwaterloo-mirror-hack) where I also discuss the rationale behind this default.

> 2. What are some additional ways we could have prevented this breach, or
>    detected it earlier, that aren’t mentioned in this post?

As analyzed in the previous post, there are several alternatives that could have prevented the breach (or fixed it afterwards):
 - Disable logging in to the mirror user.
 - Add the mirror user to `/etc/ftpusers`.
 - Run ProFTPD as the mirror user.[^c]

[^c]: This is documented in the [`<Anonymous>` directive](http://www.proftpd.org/docs/modules/mod_core.html#Anonymous) docs.

> 3. If you wrote an FTP daemon, what are some guard rails you would add to
>    prevent incidents like this?

I think the author wants readers to say "prevent all writes by default." That's fine, but I'd like to propose a different solution: keep the default but add a quick start guide that goes over starting the daemon and mentions that all users can now upload files. Then, include a short snippet showing how this can be prevented. This has the added benefit of being applicable to ProFTPD without breaking backwards compatibility.