Friday, December 14, 2018

Of Typewriters and Permutations (III)

This continues the discussion (okay, monologue) from the two previous posts about the problem of laying out a one-dimensional typewriter keyboard. This is not the last post in the series, but I can at least guarantee that the series is converging.

In the previous post, I gave a MIP model (named MIP1) that used binary variables $x_{ij}$ to signal whether or not symbol $i$ was placed at position $j$ on the keyboard. It also used auxiliary variables $p_i$ for the position of symbol $i$ and $d_{ij}$ for the distance between symbols $i$ and $j$.

You might wonder whether those auxiliary variables are worth having. I did, after the somewhat disappointing (to me) lack of progress in the lower bound when I ran MIP1. So I omitted the position variables in a new model, denoted MIP2 in the code. (Reminder: My Java code is available and free to play with.) I kept the distance variables because they make it easier to define the objective function.

MIP2 contains the $x$ and $d$ variables from MIP1, along with the constraints that rows and columns of the $x$ matrix sum to 1. It also retains the objective function from MIP1. To tie $d_{ij}$ to $x_{i\cdot}$ and $x_{j\cdot}$, MIP2 contains the following constraints:$$d_{ij} \ge |m-k|\left(x_{ik} + x_{jm} -1\right)\quad \forall i\neq j, \forall k\neq m.$$These constraints can be entered as "normal" constraints or, using an option in the command line, as lazy constraints (meaning they'll be held off to side and activated whenever a node LP solution violates one of them). Since the distances are symmetric, we actually need only about half of these. Rather than risking an indexing error in my code, I added constraints of the form$$d_{ij}=d_{ji}\quad \forall i, \forall j<i$$and let the CPLEX presolver take care of eliminating the redundant constraints for me.

Was MIP2 more efficient than MIP1? Nope, it was worse ... much worse. In a five minute run, the lower bound got stuck at zero, leaving the gap at 100%. (With the cut Rob Pratt suggested in a comment to the previous post, the lower bound got off zero but froze at 192,550, so that the gap was "only" 97.13% by the end of five minutes.

Having no luck with MIP2, I went back to MIP1. In an effort to improve its performance, I made one tweak: rather than putting the branching priorities on the assignment variables ($x$), I declared the position variables ($p$) to be integer (in MIP1 they were continuous) and put the branching priorities on them. Thus higher priority was given to pinning down the positions of higher usage characters. This model is designated "MIP3" in the code. It did a fair bit worse than MIP1.

As mentioned in the post by Nate Brixius (and in my first post), the underlying problem looks like a quadratic assignment problem (QAP). So I tried a model with a quadratic objective function (identified as "QP" in the code). It contains only the $x$ variables and row/column sum constraints from MIP1. The objective function is$$\min \sum_i \sum_{j\neq i} \sum_k \sum_{m\neq k} f_{ij} |k - m| x_{ik} x_{jm}.$$This model had a better incumbent than MIP2 but a worse incumbent than either MIP1 or MIP3, and the lower bound was stuck at zero. (Since there are no distance variables, Rob Pratt's cut is not applicable here.)

Here is a summary of results after five minutes, using emphasis 2 (optimality), branching priorities, and Rob's cut:

Model Incumbent Bound Gap
MIP1 5,686,878 2,970,708 47.76%
MIP2 6,708,297 192,550 97.13%
MIP3 5,893,907 1,503,828 74.49%
QP 6,056,668 0 100.00%

I'm not quite done. There's one more model to try, coming up in what will hopefully be my last post on this.


  1. Another option is to replace the inequality formulation of absolute value with the equality formulation: $p_i - p_j = d^+_{ij} - d^-_{ij}$, where $d^+$ and $d^-$ are nonnegative, and the objective is to minimize $\sum_{i < j} (f_{ij} + f_{ji}) (d^+_{ij} + d^-_{ij})$.

    1. That seems like a good idea on the surface, but I just tried it, and for some reason it slowed down progress on the bound. Partly this was fewer nodes being investigated per minute (perhaps because the model gets bigger). Even at the same node count, though, the bound was weaker than without this. Beats me why.


Due to recent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Mathematics Stack Exchange.