◷ 30 min read

✏ published 2020-06-05

Jack Leightcap


- Abstract

- Example

- Setup

- Caveat

- Translation Algorithm

- Defining the Run-Length Encoding of Whitespace, W

- Generating Permutation of W

- Converting Slice Permutations to DNF Expression

- Converting Expressions to CNF

- Implementation, Considerations


Picross puzzles (also known as Nonograms) are a logic puzzle in the vein of Sudoku. A puzzle consists of a set of restrictions on each row and column of a grid.

For a restriction like [2 8], the slice with that restriction must contain 2 continuous cells, a separation gap of at least 1 cell, and 8 continuous cells; with any amount of padding on either end. These restrictions are the only information needed to uniquely encode this image! I think this is crazy, isn't this such a small amount of information needed to reconstruct this image? Maybe this could this be used for data compression! Well, probably not: as solving Picross puzzles has been proven NP-complete (Ueda, Nagao).

The solution being an array of Booleans, and solving being an NP-complete problem: I immediately thought about applying a SAT solver. A SAT solver is tailored to solve the Boolean Satisfiability Problem (SAT for short) as efficiently as possible: given a Boolean formula w like:

w = (a ∨ ¬b ∨ c) ∧ (¬a ∨ b ∨ c)

can you set the input variables a, b, and c to be either true or false, such that the w is true? This may seem like a somewhat random problem, but this is the first known NP-complete problem (Cook, Levin), and there are many SAT solvers that are designed to find a set of input variables satisfying a given Boolean formula as efficiently as possible.

Here, I'll describe an algorithm to convert a Picross puzzle into an equivalent SAT problem.



Using an extremely simple puzzle as an example:

    2   1
2 | a | b |
1 | c | d |

If you understand the rules of Picross, this is solvable almost at sight: however, this is a good example of the process used to encode a puzzle as an equivalent SAT problem.

Given the restriction [1] in a space of 2, there exists two possible permutations of cells: calling the state of these two cells x and y, the state of this slice w can be expressed as

(x ∧ ¬y) ∨ (¬x ∧ y)

if the restriction were [2] in the space of 2, there's only one option:

x ∧ y

These are the only two restrictions used in the given example. For the whole board to be satisfied, all of the conditions of each slice must be satisfied:

  [a ∧ b]                (row 1)
∧ [(c ∧ ¬d) ∨ (¬c ∧ d)]  (row 2)
∧ [a ∧ c]                (column 1)
∧ [(b ∧ ¬d) ∨ (¬b ∧ d)]  (column 2)

This expression sufficiently encodes the example as a Boolean expression! If you don't care about generalizations, you can go home now.


Wow, you stuck around! For reasons that will be explained later, SAT solvers expect an expression in the form of AND of ORs. This means slices like

(c ∧ ¬d) ∨ (¬c ∧ d)

which are logical xor, which rewritten as AND of ORs:

(c ∧ ¬d) ∨ (¬c ∧ d)
    (c ∨ d) ∧ (¬c ∨ ¬d)

with this re-write,

E = [a ∧ b] ∧ [(c ∨  d) ∧ (¬c ∨ ¬d)] ∧ [a ∧ c] ∧ [(b ∨ d) ∧ (¬b ∨ ¬d)]

Small change, but this is an equation you could go plug into a SAT solver. The reason this change is necessary will be explained as the algorithm is developed.

You'll notice there's certainly some redundancy here that is possible to optimize away (a ∧ a = a); but this direct encoding without optimization reflects the original structure of the encoded puzzle closely.

Translation Algorithm

The translation algorithm of a Picross puzzle to an equivalent SAT problem has three distinct steps:

Defining the Run-Length Encoding of Whitespace, W

This is something that you naturally do when solving puzzles by hands, but here formalized.

The run-length encoding of the whitespace is a method of expressing solutions to slices in a more compact form. For example, just consider one row with size 5 and restriction

1 2 |?????|

the possible solutions,

1 2 |X--XX|

1 2 |X-XX-|

1 2 |-X-XX|

See how there's three spots there can be a 'gap' (whitespace) around the Xs? There can be a gap to the left of the lone X, a gap between the lone X and the dual Xs, and a gap to the right of the dual Xs. Expressing the individual sizes of each of these whitespaces, the run-length encoding,

1 2 |X--XX|    --> [0, 2, 0]  0 on left, 2 between, 0 on right

1 2 |X-XX-|    --> [0, 1, 1]  0 on left, 1 between, 1 on right

1 2 |-X-XX|    --> [1, 1, 0]  1 on left, 1 between, 0 on right

From this some invariants are clear. For any of of these run-length encoded whitespace vectors W with its corresponding restriction vector R in a slice of size n,

each of these having a more informal interpretation, respectively:

Generating Permutation of W

With those invariants defined, we now want a function f that maps a restriction vector R and slice length n to all the permutations of W.

For the above example, the example input and output,

f(R=[1, 2], n=5) = {[0,2,0], [0,1,1], [1,1,0]}

This is a fun little puzzle, I'd encourage you to try it out on paper and formalize in your trusty programming language. Here's a pretty shit C++ solution,

sumPermute(int n, int h, int w)
    vector<int> inner;
    vector<vector<int>> result;

    for (int ii = ceil(n/w); ii <= h ; ii++) {
        if (n - ii == 0) {
            return result;
        vector<vector<int>> recur = sumPermute(n -ii, ii, w - 1);
        for (auto& nn : recur) {
            // append each element of recur to inner, add to results
            vector<int> sum;
            sum.reserve(inner.size() + nn.size());
            sum.insert(sum.end(), nn.begin(), nn.end());
            sum.insert(sum.end(), inner.begin(), inner.end());
    return result;

Converting Slice Permutations to DNF Expression

At this point, for a given slice S, we have generated all valid permutations of W. To recover P the set of all valid slice permutations from each W, we do the encoding in 'reverse',

When repeated for each valid permutation of W, the resulting set is P.

To convert P to a set of Boolean expressions, an encode each board cell as a unique integer with row-major ordering,

B_{x,y} = x + dimX × y

To convert each element of P to a corresponding Boolean expression, iterate over each permutation encoding ¬B(i,j) if a zero is read and B(i,j) if a one is read. For one instance of a permutation, all of these literals are ANDed, and all permutations are ORed for a resulting expression of every possibility for the given slice.

This final expression is in Disjunctive Normal Form (OR of ANDs, or Sum of Products):

∨_{i=0}^m ( ∧_{j=0}^n B_{i,j} )

This completes the second step of the translation algorithm: converting a given slice into a DNF expression.

Converting Expressions to CNF

At this point, we have a set of DNF expressions that satisfy each slice of the board. To satisfy the entire board, all of these expressions must be satisfied: ANDing over all slices w gives a total expression for the board w:

E = ∧_{S} [ ∨_{i=0}^m ( ∧_{j=0}^n B^S_{i,j} ) ]

However, SAT solvers reasonably expect a standard form for input expressions. Conjunctive Normal Form is the form SAT solvers use to represent a Boolean formula (see this thread for explanations as to why),

∧_{i=0}^n ( ∨_{j=0}^m x_{ij} )

So the final step of the translation algorithm is the conversion of w into CNF. Well, we're already pretty close with the outermost AND; in fact, just translating the inner DNF expression to CNF is sufficient,

∧ [ ∨ ( ∧ x ) ]   ---> ∧ [ ∧ ( ∨ x ) ] == ∧ ( ∨ x )

Converting DNF expressions to CNF expressions is another great exponential translation! I believe this algorithm is, formally speaking, O(terrible). Using w as a shorthand for a sequence of ANDed literals, writing an expanded form of w:

E =   (a_{11} ∨ a_{12} ∨  ... ∨ a_{1n})
    ∧ (a_{21} ∨ a_{22} ∨  ... ∨ a_{2n})
    ∧ ...
    ∧ (a_{m1} ∨ a_{m2} ∨  ... ∨ a_{mn})

Converting to a CNF expression is once again making every permutation of sets. Using an example,

E = (x_{11} ∧ x_{12} ∧ x_{13}) ∨  (x_{21} ∧ x_{22} ∧ x_{23})


E =   (x_{11} ∨ x_{21}) ∧ (x_{11} ∨ x_{22}) ∧ (x_{11} ∨ x_{23})
    ∧ (x_{12} ∨ x_{21}) ∧ (x_{12} ∨ x_{22}) ∧ (x_{12} ∨ x_{23})
    ∧ (x_{13} ∨ x_{21}) ∧ (x_{13} ∨ x_{22}) ∧ (x_{13} ∨ x_{23})

This is very exponential, and the general form is quite large: using |a| = l,

E =   (a_{111} ∨ a_{211} ∨ ... ∨ a_{m11}) ∧ ... ∧ (a_{11l} ∨ a_{211} ∨ ... ∨ a_{m11}) ∧
      (a_{111} ∨ a_{212} ∨ ... ∨ a_{m11}) ∧ ... ∧ (a_{11l} ∨ a_{212} ∨ ... ∨ a_{m11}) ∧
                            ⋮                                           ⋮
      (a_{111} ∨ a_{21l} ∨ ... ∨ a_{m11}) ∧ ... ∧ (a_{11l} ∨ a_{21l} ∨ ... ∨ a_{m11}) ∧
                            ⋮                                           ⋮
      (a_{111} ∨ a_{21l} ∨ ... ∨ a_{m1l}) ∧ ... ∧ (a_{11l} ∨ a_{21l} ∨ ... ∨ a_{m1l}) ∧
                            ⋮                                           ⋮
      (a_{1n1} ∨ a_{2nl} ∨ ... ∨ a_{mnl}) ∧ ... ∧ (a_{1nl} ∨ a_{2nl} ∨ ... ∨ a_{mnl})

The number of expressions grows as O(l^m). However, with this final conversion, the translation algorithm is complete! Any Picross puzzle can be converted to an equivalent, SAT solver ready, CNF expression using this algorithm.

Implementation, Considerations

The complete code for PicrossSAT can be found here. It's bad, don't look at it.