◷ 30 min read

✏ published 2020-06-05

- *Abstract*

- *Example*

- *Setup*

- *Caveat*

- *Defining the Run-Length Encoding of Whitespace, W*

- *Converting Slice Permutations to DNF Expression*

- *Converting Expressions to CNF*

- *Implementation, Considerations*

Picross puzzles (also known as Nonograms) are a logic puzzle in the vein of Sudoku. A puzzle consists of a set of restrictions on each row and column of a grid.

For a restriction like `[2 8]`

, the slice with that restriction must contain 2 continuous cells, a separation gap of at least 1 cell, and 8 continuous cells; with any amount of padding on either end. These restrictions are the only information needed to uniquely encode this image! I think this is crazy, isn't this such a small amount of information needed to reconstruct this image? Maybe this could this be used for data compression! Well, probably not: as solving Picross puzzles has been proven NP-complete (Ueda, Nagao).

The solution being an array of Booleans, and solving being an NP-complete problem: I immediately thought about applying a SAT solver. A SAT solver is tailored to solve the **Boolean Satisfiability Problem** (**SAT** for short) as efficiently as possible: given a Boolean formula w like:

`w = (a ∨ ¬b ∨ c) ∧ (¬a ∨ b ∨ c)`

can you set the input variables a, b, and c to be either true or false, such that the w is true? This may seem like a somewhat random problem, but this is the first known NP-complete problem (Cook, Levin), and there are many SAT solvers that are designed to find a set of input variables satisfying a given Boolean formula as efficiently as possible.

Here, I'll describe an algorithm to convert a Picross puzzle into an equivalent SAT problem.

Using an extremely simple puzzle as an example:

```
2 1
+---+---+
2 | a | b |
+---+---+
1 | c | d |
+---+---+
```

If you understand the rules of Picross, this is solvable almost at sight: however, this is a good example of the process used to encode a puzzle as an equivalent SAT problem.

Given the restriction `[1]`

in a space of 2, there exists two possible permutations of cells: calling the state of these two cells **x** and **y**, the state of this slice w can be expressed as

`(x ∧ ¬y) ∨ (¬x ∧ y)`

if the restriction were `[2]`

in the space of 2, there's only one option:

`x ∧ y`

These are the only two restrictions used in the given example. For the whole board to be satisfied, all of the conditions of each slice must be satisfied:

```
[a ∧ b] (row 1)
∧ [(c ∧ ¬d) ∨ (¬c ∧ d)] (row 2)
∧ [a ∧ c] (column 1)
∧ [(b ∧ ¬d) ∨ (¬b ∧ d)] (column 2)
```

This expression sufficiently encodes the example as a Boolean expression! If you don't care about generalizations, you can go home now.

Wow, you stuck around! For reasons that will be explained later, SAT solvers expect an expression in the form of `AND`

of `OR`

s. This means slices like

`(c ∧ ¬d) ∨ (¬c ∧ d)`

which are logical xor, which rewritten as **AND** of **OR**s:

```
(c ∧ ¬d) ∨ (¬c ∧ d)
=>
(c ∨ d) ∧ (¬c ∨ ¬d)
```

with this re-write,

`E = [a ∧ b] ∧ [(c ∨ d) ∧ (¬c ∨ ¬d)] ∧ [a ∧ c] ∧ [(b ∨ d) ∧ (¬b ∨ ¬d)]`

Small change, but this is an equation you could go plug into a SAT solver. The reason this change is necessary will be explained as the algorithm is developed.

You'll notice there's certainly some redundancy here that is possible to optimize away (a ∧ a = a); but this direct encoding without optimization reflects the original structure of the encoded puzzle closely.

The translation algorithm of a Picross puzzle to an equivalent SAT problem has three distinct steps:

- For each restriction, generate all permutations of the run-length encoding of the whitespace.
- Express each set of permutations as a DNF expression.
- Convert all DNF expressions into a CNF expression.

This is something that you naturally do when solving puzzles by hands, but here formalized.

The **run-length encoding** of the **whitespace** is a method of expressing solutions to slices in a more compact form. For example, just consider one row with size 5 and restriction

```
...
1 2 |?????|
...
```

the possible solutions,

```
1 2 |X--XX|
1 2 |X-XX-|
1 2 |-X-XX|
```

See how there's three spots there can be a 'gap' (whitespace) around the Xs? There can be a gap to the left of the lone X, a gap between the lone X and the dual Xs, and a gap to the right of the dual Xs. Expressing the individual sizes of each of these whitespaces, the **run-length encoding**,

```
1 2 |X--XX| --> [0, 2, 0] 0 on left, 2 between, 0 on right
1 2 |X-XX-| --> [0, 1, 1] 0 on left, 1 between, 1 on right
1 2 |-X-XX| --> [1, 1, 0] 1 on left, 1 between, 0 on right
```

From this some invariants are clear. For any of of these run-length encoded whitespace vectors **W** with its corresponding restriction vector **R** in a slice of size **n**,

- |W| = |R| + 1
- |W| + |R| = n
- W(0) ≥ 0, W(|W|) ≥ 0
- W(1 .. |W|-1) > 0

each of these having a more informal interpretation, respectively:

- There's one more group of whitespaces than groups of Xs. For the above example,
`len([0, 2, 0]) = len([1, 2]) + 1`

which is invariant of the**W**picked. Think of fenceposts. - The sum of Xs and whitespace is the length of the slice.
`len([1, 2]) + len([0, 2, 0]) = 5`

, every cell is occupied by an X or whitespace. - Any middle element of
**W**must be strictly positive (non-zero). If a middle element was zero, then there would not be a gap between the Xs, violating the restrictions imposed by**R**. - The first and last elements of
**W**are strictly positive, or zero. This is a special case of the previous restriction, given that there might not be whitespace at each end.

With those invariants defined, we now want a function **f** that maps a restriction vector **R** and slice length **n** to all the permutations of **W**.

For the above example, the example input and output,

`f(R=[1, 2], n=5) = {[0,2,0], [0,1,1], [1,1,0]}`

This is a fun little puzzle, I'd encourage you to try it out on paper and formalize in your trusty programming language. Here's a pretty shit C++ solution,

```
vector<vector<int>>
sumPermute(int n, int h, int w)
{
vector<int> inner;
vector<vector<int>> result;
for (int ii = ceil(n/w); ii <= h ; ii++) {
inner.clear();
inner.push_back(ii);
if (n - ii == 0) {
result.push_back(inner);
return result;
}
vector<vector<int>> recur = sumPermute(n -ii, ii, w - 1);
for (auto& nn : recur) {
// append each element of recur to inner, add to results
vector<int> sum;
sum.reserve(inner.size() + nn.size());
sum.insert(sum.end(), nn.begin(), nn.end());
sum.insert(sum.end(), inner.begin(), inner.end());
result.push_back(sum);
}
}
return result;
}
```

At this point, for a given slice S, we have generated all valid permutations of W. To recover P the set of all valid slice permutations from each W, we do the encoding in 'reverse',

- push back W(0) zeros
- push back R(0) ones
- ...
- push back R(|R|) ones
- push back W(|W|) zeros

When repeated for each valid permutation of W, the resulting set is P.

To convert P to a set of Boolean expressions, an encode each board cell as a unique integer with row-major ordering,

`B_{x,y} = x + dimX × y`

To convert each element of P to a corresponding Boolean expression, iterate over each permutation encoding ¬B(i,j) if a zero is read and B(i,j) if a one is read. For one instance of a permutation, all of these literals are `AND`

ed, and all permutations are `OR`

ed for a resulting expression of every possibility for the given slice.

This final expression is in **Disjunctive Normal Form** (`OR`

of `AND`

s, or Sum of Products):

`∨_{i=0}^m ( ∧_{j=0}^n B_{i,j} )`

This completes the second step of the translation algorithm: converting a given slice into a DNF expression.

At this point, we have a set of DNF expressions that satisfy each slice of the board. To satisfy the entire board, all of these expressions must be satisfied: `AND`

ing over all slices w gives a total expression for the board w:

`E = ∧_{S} [ ∨_{i=0}^m ( ∧_{j=0}^n B^S_{i,j} ) ]`

However, SAT solvers reasonably expect a standard form for input expressions. **Conjunctive Normal Form** is the form SAT solvers use to represent a Boolean formula (see this thread for explanations as to why),

`∧_{i=0}^n ( ∨_{j=0}^m x_{ij} )`

So the final step of the translation algorithm is the conversion of w into CNF. Well, we're already pretty close with the outermost `AND`

; in fact, just translating the inner DNF expression to CNF is sufficient,

`∧ [ ∨ ( ∧ x ) ] ---> ∧ [ ∧ ( ∨ x ) ] == ∧ ( ∨ x )`

Converting DNF expressions to CNF expressions is another great exponential translation! I believe this algorithm is, formally speaking, *O*(terrible). Using w as a shorthand for a sequence of `AND`

ed literals, writing an expanded form of w:

```
E = (a_{11} ∨ a_{12} ∨ ... ∨ a_{1n})
∧ (a_{21} ∨ a_{22} ∨ ... ∨ a_{2n})
∧ ...
∧ (a_{m1} ∨ a_{m2} ∨ ... ∨ a_{mn})
```

Converting to a CNF expression is once again making every permutation of sets. Using an example,

`E = (x_{11} ∧ x_{12} ∧ x_{13}) ∨ (x_{21} ∧ x_{22} ∧ x_{23})`

then

```
E = (x_{11} ∨ x_{21}) ∧ (x_{11} ∨ x_{22}) ∧ (x_{11} ∨ x_{23})
∧ (x_{12} ∨ x_{21}) ∧ (x_{12} ∨ x_{22}) ∧ (x_{12} ∨ x_{23})
∧ (x_{13} ∨ x_{21}) ∧ (x_{13} ∨ x_{22}) ∧ (x_{13} ∨ x_{23})
```

This is **very** exponential, and the general form is quite large: using |a| = l,

```
E = (a_{111} ∨ a_{211} ∨ ... ∨ a_{m11}) ∧ ... ∧ (a_{11l} ∨ a_{211} ∨ ... ∨ a_{m11}) ∧
(a_{111} ∨ a_{212} ∨ ... ∨ a_{m11}) ∧ ... ∧ (a_{11l} ∨ a_{212} ∨ ... ∨ a_{m11}) ∧
⋮ ⋮
(a_{111} ∨ a_{21l} ∨ ... ∨ a_{m11}) ∧ ... ∧ (a_{11l} ∨ a_{21l} ∨ ... ∨ a_{m11}) ∧
⋮ ⋮
(a_{111} ∨ a_{21l} ∨ ... ∨ a_{m1l}) ∧ ... ∧ (a_{11l} ∨ a_{21l} ∨ ... ∨ a_{m1l}) ∧
⋮ ⋮
(a_{1n1} ∨ a_{2nl} ∨ ... ∨ a_{mnl}) ∧ ... ∧ (a_{1nl} ∨ a_{2nl} ∨ ... ∨ a_{mnl})
```

The number of expressions grows as *O*(l^m). However, with this final conversion, the translation algorithm is complete! Any Picross puzzle can be converted to an equivalent, SAT solver ready, CNF expression using this algorithm.

The complete code for `PicrossSAT`

can be found here. It's bad, don't look at it.