~egtann/binp

composable binpacking tool
7746a19a — Evan Tann 2 months ago
fix net name in test services file
abffd380 — Evan Tann 2 months ago
rename network to net in config files
ddcce4d2 — Evan Tann 2 months ago
fix readme example mbps

refs

master
browse  log 

clone

read-only
https://git.sr.ht/~egtann/binp
read/write
git@git.sr.ht:~egtann/binp

You can also use your local clone with git send-email.

binp

binp (pronounced bin-pea) is a bin-packing tool to minimize costs when deploying architectures. It is similar to the functionality built into Kubernetes or Nomad, but this tool is designed under the UNIX philosophy of doing one thing well, favoring composability with other tools.

binp has one major function: given a description of a machine and services to run, how many machines are necessary, and which programs should run on each machine? In doing this, it also reports the excess resources, so you can decide if you want to decrease machine resources to save costs.

binp is both a library that can be used in other programs, as well as a CLI executable.

Install

$ go get -u egt.run/binp

How to use

binp accepts an inventory JSON file of servers and the services that can run on each of them and outputs a mapping of which services to run on numbered boxes.

For example, this could be our services.json file:

{
    "services": {
        "openbsd": {
            "app_1": {
                "cpu": 2,
                "ram": "4 GB",
                "disk": "100 MB",
                "count": 4,
                "net": {
                    "mbps": 500,
                    "ports": [3000, 3001]
                }
            },
            "app_2": {
                "cpu": 1,
                "ram": "2 GB",
                "disk": "50 MB",
                "count": 2,
                "net": {
                    "mbps": 100,
                    "ports": [3002]
                }
            },
            "postgres": {
                "cpu": 6,
                "ram": "32 GB",
                "disk": "1 TB",
                "count": 2,
                "net": {
                    "ports": [3002]
                }
            }
        },
        "debian": {
            "redis": {
                "cpu": 1,
                "ram": "2 GB",
                "disk": "4 GB",
                "count": 1
            }
        }
    },
    "boxes": {
        "openbsd": {
            "cpu": 12,
            "ram": "32 GB",
            "disk": "1 TB",
            "net": {"mbps": 1000}
        },
        "debian": {
            "cpu": 1,
            "ram": "4 GB",
            "disk": "10 GB",
            "net": {"mbps": 1000}
        }
    }
}

To run binp, we call this (output formatted slightly for legibility):

$ binp
{
    "debian":  [
        ["redis"]
    ],
    "openbsd": [
        ["postgres"],
        ["postgres"],
        ["app_1","app_2"],
        ["app_1","app_2"],
        ["app_1"],
        ["app_1"]
    ]
}

binp's output details the services which should be run on each server, and the total number of servers needed.

To output just the number of machines needed, compose binp's output with other tools, like jq!

# Get the total number of servers to provision
$ binp | jq 'flatten | length'

# Or get the number of servers for a single box type
$ binp | jq '.openbsd | length'

You could pass the length into something like Terraform to automatically provision the correct number of servers for your architecture, or the services themselves into Terrafirma.

To help save costs, binp can make recommendations as to the smallest CPU/RAM/Disk which meets the requirements in services.json file:

$ binp -min
openbsd:
    CPU:  6
    RAM:  32 GB
    Disk: 1 TB
    Net:  600 mbps
debian:
    CPU:  1
    RAM:  2 GB
    Disk: 1 TB
    Net:  0 mbps

Or the smallest CPU/RAM/Disk which will allow co-locating all services on a single box with -max.

$ binp -max
openbsd:
    CPU:  9
    RAM:  38 GB
    Disk: 1 TB
debian:
    CPU:  1
    RAM:  2 GB
    Disk: 4 GB

Algorithm

Bin-packing is an NP-hard problem. For that reason, binp does not try to find the optimal solution. Instead it tries to find a close-to-optimal solution that is good enough.

It uses a variation of the first-fit descending algorithm, sorting services from largest to smallest with a scoring function, and then distributing services to each machine in rotation.

This approach finds solutions which distribute the same program on multiple machines whereever possible. This is especially useful for high availability environments where 1 (or several) VM failures should not result in an outage.

binp's solution is deterministic, so running multiple times given the same input will always yield the same output.