A question posed on the OR Discord channel by a doctoral student led me to discover the existence of the Jane Street puzzle page. The student was asking about building a MILP model for the June 2022 puzzle, called "Block Party 4". The puzzle involves inserting numbers into a grid, with some cells already filled in. It bears a superficial resemblance to sudoku, but with a few key differences. Where a sudoku is divided into nine square regions of nine cells each, the block party puzzle grid is divided into connected regions of varying sizes and shapes. Within a region of $k$ cells, the numbers 1 through $k$ must be filled in. Finally, rather than requiring that rows and columns contain no repeated numbers, the rules require that, for each possible value $K$, if $K$ is inserted into a cell then the nearest instance of $K$ must be at distance exactly $K$ in the $L_1$ norm. So to use a 1, there must be a 1 in an adjacent cell. To use a 2, there must be a 2 in a cell two moves away but no 2 in any adjacent cell.
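The rules above can be made concrete with a small checker. What follows is only a sketch in Python (not the Java used for the actual models), and the function names, the `grid`/`regions` layout and the tiny 2 x 3 example are all assumptions made for illustration; none of them come from the puzzle page.

```python
from collections import defaultdict


def nearest_same_value_distance(grid, r, c):
    """L1 distance from (r, c) to the nearest other cell holding the same
    value, or None if no other cell holds that value."""
    value = grid[r][c]
    best = None
    for rr, row in enumerate(grid):
        for cc, other in enumerate(row):
            if (rr, cc) != (r, c) and other == value:
                d = abs(rr - r) + abs(cc - c)
                if best is None or d < best:
                    best = d
    return best


def is_valid(grid, regions):
    """grid[r][c] is the value in a cell; regions[r][c] is its region label
    (hypothetical layout, chosen for this sketch)."""
    values_by_region = defaultdict(list)
    for r, row in enumerate(regions):
        for c, label in enumerate(row):
            values_by_region[label].append(grid[r][c])
    # A region of k cells must contain each of 1..k exactly once.
    for values in values_by_region.values():
        if sorted(values) != list(range(1, len(values) + 1)):
            return False
    # The nearest other copy of a value K must sit at distance exactly K.
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if nearest_same_value_distance(grid, r, c) != value:
                return False
    return True


# Made-up 2 x 3 example: regions "a" and "b" have two cells, "c" and "d" one.
regions = [["a", "c", "b"],
           ["a", "d", "b"]]
print(is_valid([[2, 1, 2], [1, 1, 1]], regions))  # the 2s are two moves apart
print(is_valid([[1, 1, 2], [2, 1, 1]], regions))  # the 2s are three moves apart
```

In the first grid every 1 has an adjacent 1 and the two 2s are exactly two moves apart, so it passes; in the second the 2s are three moves apart, so it fails.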
Since this is a problem with logic constraints and integer decisions, my instinct was to think that constraint programming would be faster than integer programming. To test this, I coded both an IP model and a CP model in Java, using CPLEX and CP Optimizer as the respective solvers. I assumed that the grid would be square, since both the June puzzle (10 x 10) and a smaller example provided (5 x 5) were. Both models can easily be adjusted for non-square grids.
Assume an $N\times N$ grid partitioned into regions, and let $M$ be the size of the largest region (and thus the largest value that can be used in the puzzle). Number the cells 1 through $N^2$ in any order. (I used a left-to-right raster scan.) For the IP model, I use binary variables $x_{i,j}$ $(i=1,\dots,N^2$, $j=1,\dots,M)$ to indicate whether value $j$ is inserted into cell $i$. For cells with known values, I fix $x_{i,j}$ to either 0 or 1 as appropriate while building the model. Also, if cell $i$ lies in a region of size $K$, then I can fix $x_{i,j}=0$ for $j>K.$
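As a rough illustration of the fixing step, here is a Python sketch (again, not the Java code) that lists which $x_{i,j}$ can be fixed before the solver sees the model. The function name `fixed_variables` and the `regions`/`givens` inputs are hypothetical; cells are numbered 1 through $N^2$ in raster order to match the text.

```python
def fixed_variables(regions, givens):
    """Return {(i, j): 0 or 1} for the x[i, j] that can be fixed up front.

    regions[r][c] is a region label and givens maps (r, c) to a known value
    (both layouts are assumptions of this sketch). Cell indices i and values
    j are 1-based, with cells numbered in left-to-right raster order.
    """
    n = len(regions)
    sizes = {}
    for row in regions:
        for label in row:
            sizes[label] = sizes.get(label, 0) + 1
    m = max(sizes.values())  # M: size of the largest region
    fixed = {}
    for r in range(n):
        for c in range(n):
            i = r * n + c + 1
            k = sizes[regions[r][c]]
            for j in range(1, m + 1):
                if (r, c) in givens:
                    # Known cell: exactly one x[i, j] is 1, the rest are 0.
                    fixed[i, j] = 1 if givens[r, c] == j else 0
                elif j > k:
                    # Values larger than the region size are impossible.
                    fixed[i, j] = 0
    return fixed


# Made-up 2 x 2 grid: region "a" has three cells, region "b" has one.
regions = [["a", "a"],
           ["a", "b"]]
print(fixed_variables(regions, {(0, 0): 2}))
```

Here cell 1 is known to hold a 2, so all three of its variables are fixed, and cell 4 (the lone cell of region "b") can only hold a 1, so its variables for values 2 and 3 are fixed to 0.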
Since we just want a feasible solution, I let the IP objective function default to minimizing zero. The most obvious constraint is $$\sum_{j=1}^M x_{i,j} = 1 \quad \forall i,$$which forces a single value to be selected for each cell. Similarly, if $B$ is a block with size $K,$ then $$\sum_{i\in B}x_{i,j}=1 \quad \forall j=1,\dots,K$$forces every value from 1 to $K$ to be used exactly once in the block. Finally, for each block $B,$ each cell $i\in B$ and each legal value $j\in \lbrace 1, \dots, \vert B\vert\rbrace$ for that cell, we add these constraints: $$x_{i,j} \le \sum_{k\in N_j(i)} x_{k,j} $$ and $$x_{i,j} + x_{k,j} \le 1\quad \forall k\in N^-_j(i),$$ where $N_j(i)$ is the set of all cells at distance exactly $j$ from cell $i$ and $N^-_j(i)$ is the set of all cells at distance less than $j$ from cell $i$ (excluding cell $i$ itself). These enforce the rule that, for value $j$ to be used in cell $i,$ it must also be used in at least one cell at distance $j$ from $i$ and in no closer cell.
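The sets $N_j(i)$ and $N^-_j(i)$ are easy to enumerate. The following Python sketch (not the Java code, and deliberately solver-free) computes them and lists the two families of distance constraints for one cell as plain tuples. The names `neighborhoods` and `distance_rows`, and the tuple layout, are assumptions made for illustration; cells are numbered 1 through $N^2$ in raster order.

```python
def neighborhoods(n, i, j):
    """For cell i (1-based raster index) on an n x n grid, return
    (N_j(i), N^-_j(i)): cells at L1 distance exactly j, and cells at
    L1 distance 1..j-1 (cell i itself is excluded)."""
    r, c = divmod(i - 1, n)
    exact, closer = [], []
    for k in range(1, n * n + 1):
        rr, cc = divmod(k - 1, n)
        d = abs(rr - r) + abs(cc - c)
        if d == j:
            exact.append(k)
        elif 0 < d < j:
            closer.append(k)
    return exact, closer


def distance_rows(n, i, max_value):
    """List the distance constraints for cell i and values 1..max_value.

    ("at_least_one", i, j, cells) stands for x[i,j] <= sum of x[k,j] over cells
    ("not_both", i, k, j)         stands for x[i,j] + x[k,j] <= 1
    """
    rows = []
    for j in range(1, max_value + 1):
        exact, closer = neighborhoods(n, i, j)
        # If `exact` is empty, this row says x[i,j] <= 0, i.e. value j
        # cannot be used in cell i at all.
        rows.append(("at_least_one", i, j, exact))
        rows.extend(("not_both", i, k, j) for k in closer)
    return rows


# Corner cell 1 of a 3 x 3 grid, assuming its block has two cells.
for row in distance_rows(3, 1, 2):
    print(row)
```

For the corner cell, value 1 needs a 1 in cell 2 or cell 4, while value 2 needs a 2 in one of cells 3, 5 or 7 and rules out a 2 in cells 2 and 4.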
The CP model is a bit more straightforward to articulate. Again, there is no objective function, since we are just solving for a feasible solution. For each cell $i$, there is a single integer variable $x_i$ with domain $\lbrace 1,\dots,\vert B \vert\rbrace$ where $B$ is the block containing cell $i.$ If we know that cell $i$ is fixed to value $k,$ we just declare $x_i$ to have domain $\lbrace k \rbrace.$ For each block, we use an "all different" constraint to enforce the requirement that the cells in the block take distinct values. For each cell $i$ and legal value $j$ for it, the implication constraint $$(x_i = j) \implies \bigvee_{k\in N_j(i)} (x_k = j)$$ where $\bigvee$ denotes disjunction ("or"), forces at least one cell at distance $j$ from $i$ to take value $j$ if cell $i$ does, while the constraints $$(x_i = j) \implies (x_k \neq j) \quad \forall k\in N^-_j(i)$$ prohibit any closer cell from using that value. (These constraints could be condensed into a single implication with a conjunction on the right-hand side. For reasons I have since forgotten, I did not bother to do so.)
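To show the same three ingredients (per-cell domains, all-different within a block, and the two distance implications) without a CP solver, here is a naive backtracking search in Python. It is a stand-in for illustration only: it is not CP Optimizer, it does no constraint propagation, and it is only meant for tiny grids. The function name `solve` and the `regions`/`givens` inputs are hypothetical.

```python
def solve(regions, givens=None):
    """Naive backtracking over one integer per cell.

    regions[r][c] is a region label; givens maps (r, c) to a fixed value.
    Returns a filled grid (list of lists) or None if no solution exists.
    """
    givens = givens or {}
    rows, cols = len(regions), len(regions[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    size = {}
    for r, c in cells:
        size[regions[r][c]] = size.get(regions[r][c], 0) + 1
    grid = [[None] * cols for _ in range(rows)]

    def dist(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def same_valued(cell, value):
        return [o for o in cells if o != cell and grid[o[0]][o[1]] == value]

    def allowed(cell, value):
        # All-different within the block, and no equal value closer than `value`.
        label = regions[cell[0]][cell[1]]
        for o in same_valued(cell, value):
            if regions[o[0]][o[1]] == label or dist(cell, o) < value:
                return False
        return True

    def supported(cell):
        # Some cell at distance exactly `value` must hold the same value.
        value = grid[cell[0]][cell[1]]
        return any(dist(cell, o) == value for o in same_valued(cell, value))

    def search(idx):
        if idx == len(cells):
            # The "at least one at distance j" rule is checked on full grids.
            return all(supported(cell) for cell in cells)
        r, c = cells[idx]
        if (r, c) in givens:
            domain = [givens[r, c]]          # known cell: singleton domain
        else:
            domain = range(1, size[regions[r][c]] + 1)
        for value in domain:
            if allowed((r, c), value):
                grid[r][c] = value
                if search(idx + 1):
                    return True
                grid[r][c] = None
        return False

    return grid if search(0) else None


# Made-up 2 x 3 example with one given value in the top-left cell.
regions = [["a", "c", "b"],
           ["a", "d", "b"]]
print(solve(regions, {(0, 0): 2}))
```

Checking the "at least one" implication only at the leaves is wasteful; a real CP solver prunes on it during the search, which is exactly the work the implication constraints above hand off to the solver.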
Both models solved the 10x10 puzzle easily. My expectation was that the CP model would be faster, for several reasons. First, it has 100 general integer variables, whereas the IP model started out with 1,100 binary variables (which the presolver whittled down to 119 binary variables, compared to 90 variables for the CP model after presolving). Second, the "all different" CP constraint seems to be a more efficient way than a steaming pile of inequality constraints to enforce the rule that no two cells in the same block take the same value. Third, CP Optimizer would be doing integer arithmetic while CPLEX would be doing double precision arithmetic, and on a per-operation basis integer arithmetic should be faster. Lastly, my experience in the past has been that the one edge IP models tend to have over CP models is tighter bounds, but that has no effect in a feasibility problem (when you are not optimizing anything).
As it turns out, I was in for a surprise. Actually, make that two surprises. First, the IP model after presolving had 211 constraints, whereas the CP model after presolving had 7,399 constraints. Note that, in the implication constraints, the left side and each equality on the right side count as a constraint. I'm not sure how comparable constraints are between the two models, but I was not expecting the CP model to have so many more. Second, while both models solved in negligible time, the IP model was faster. CPLEX solved the IP model at the root node (no branching) in about 0.01 seconds on my fairly average desktop PC. CP Optimizer needed 2,678 branches and about 0.12 seconds to solve the CP model, of which 0.05 seconds was spent in the "engine" (i.e., solving) and the rest was spent in "extraction" (turning the model into something suitable for the engine).
My Java code (which requires both CPLEX and CP Optimizer but nothing else) can be found in my GitLab repository.