
Thursday, September 11, 2025

Building a Partial Tour

A user on Mathematics Stack Exchange posted a question that, while superficially similar to the prize collecting traveling salesman problem (TSP), differs from it in substantial ways. You are given a rectangular grid where each cell contains a reward. The task is to build a tour of some (but not all) cells, with movement restricted to adjacent cells (those sharing an edge with the current cell), subject to the restriction that you cannot visit two cells with identical rewards. The objective is to maximize the total reward of the cells visited. The problem differs from the prize collecting TSP in several ways:

  • there is no designated start/end cell for the tour;
  • there are no edge costs (movement incurs no objective penalty);
  • there is no penalty for unvisited cells; and of course
  • duplicate rewards are prohibited.

A key part of the user's question was how to handle subtour elimination. The user was looking at the classic Miller-Tucker-Zemlin (MTZ) approach for TSPs, which involves an auxiliary variable at each cell recording, in essence, the position of the cell (first, second, ...) in the tour. The implementation of MTZ involves constraints at each cell except the depot (tour origin/destination). In the current problem, there is no designated depot, and you cannot just pick a cell to serve as the depot because you do not know which cells will be in the optimal tour.

The workaround I suggested in a reply to the post is to add an artificial cell (the "depot"), connected to every cell in the original grid, and use that to implement the MTZ constraints. The one mildly tricky part is that the cell immediately preceding the depot and the cell immediately following the depot in the tour need to be adjacent (or the same, if the tour only visits one cell). That is easily handled at the cost of some extra constraints.

Assume the grid contains $m$ rows and $n$ columns. Let us start by numbering the cells $1,\dots,mn$ in raster scan order from upper left to lower right, and call the depot cell 0. Let $r_i$ denote the reward for visiting cell $i$ and $N_i$ denote the set of cells (including the depot) adjacent to cell $i$. Finally, let $R$ denote the set of possible rewards and $C_r$ the "cluster" of cells having the same reward $r\in R.$ We build the model as follows.
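
To make the bookkeeping concrete, here is a small sketch in plain Java (no solver involved) that builds the raster numbering, the neighbor sets $N_i$ (with the artificial depot adjacent to every grid cell) and the reward clusters. The grid in it is made up, and the class and variable names are just illustrative.

```java
import java.util.*;

/** Sketch only: raster numbering of cells (1..mn, depot = 0), neighbor sets
 *  N_i (with the depot adjacent to every cell), and clusters of cells that
 *  share a reward value. The grid below is hypothetical. */
public class GridSetup {
    public static void main(String[] args) {
        int[][] reward = {{3, 1, 4}, {1, 5, 9}};        // made-up 2 x 3 grid
        int m = reward.length, n = reward[0].length, mn = m * n;

        // r[i] = reward of cell i (cells numbered 1..mn in raster order)
        double[] r = new double[mn + 1];
        // N.get(i) = set of cells adjacent to cell i
        List<Set<Integer>> N = new ArrayList<>();
        for (int i = 0; i <= mn; i++) N.add(new HashSet<>());

        for (int row = 0; row < m; row++) {
            for (int col = 0; col < n; col++) {
                int i = row * n + col + 1;              // raster index
                r[i] = reward[row][col];
                N.get(0).add(i);                        // depot is adjacent to every cell
                N.get(i).add(0);
                if (col + 1 < n) { N.get(i).add(i + 1); N.get(i + 1).add(i); }  // right neighbor
                if (row + 1 < m) { N.get(i).add(i + n); N.get(i + n).add(i); }  // neighbor below
            }
        }

        // C_r = cluster of cells with reward r (at most one of these can be visited)
        Map<Double, List<Integer>> clusters = new HashMap<>();
        for (int i = 1; i <= mn; i++)
            clusters.computeIfAbsent(r[i], k -> new ArrayList<>()).add(i);

        System.out.println("Neighbors of cell 1: " + N.get(1));
        System.out.println("Reward clusters: " + clusters);
    }
}
```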

Variables

  • $x_{ij}\in \{0,1\}$ will be 1 if the tour moves from cell $i$ to cell $j,$ 0 otherwise. 
  • $y_{i} \in \{0,1\}$ will be 1 if the tour visits cell $i > 0,$ 0 if not.
  • $z_{i} \ge 0$ is the auxiliary variable (position) for cell $i$ in the MTZ constraints (indexing where cell $i$ falls in the tour). If cell $i$ does not appear in the tour, the value of $z_{i}$ will be irrelevant (and the solver will likely set it to 0).
 

Objective

The objective is to maximize $$\sum_{i} r_i y_i.$$ 
 

Constraints

  • Movements may only occur between adjacent cells (with the depot cell adjacent to every other cell).  $$x_{ij} = 0 \quad \forall i, \forall j\notin N_i.$$
 
  • The depot must be exited and entered exactly once. $$\sum_{i > 0} x_{0i} = 1 = \sum_{i > 0} x_{i0}.$$
 
  • Each cell other than the depot is entered and exited once if visited and zero times if not visited. $$\sum_{j} x_{ji} = y_i = \sum_{j} x_{ij} \quad \forall i>0.$$
 
  • At most one cell with a given reward value can be visited. $$\sum_{i\in C_r} y_i \le 1 \quad \forall r\in R.$$
 
  • MTZ: If the tour moves from cell $i$ to cell $j,$ the position value for cell $j$ must be at least one greater than the position value for cell $i$ (excluding $j=0,$ i.e., return to depot). $$x_{ij} = 1 \implies z_j \ge z_i + 1 \quad \forall i, \forall j > 0.$$
 
  • If the tour goes $0 \rightarrow i \rightarrow \dots \rightarrow j \rightarrow 0,$ then cells $i$ and $j$ must be adjacent (or the same). $$x_{0i} + x_{j0} \le 1 \quad \forall i>0, j>0, i\neq j, j \notin N_i.$$
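
For the curious, here is a rough sketch of how the model might be assembled in code. It is illustrative only: my actual code uses the Xpress Java API, whereas this sketch uses CPLEX's Concert API, replaces the MTZ indicator constraints with an equivalent big-M linearization ($M = mn + 1$), and assumes the reward array and neighbor sets from the earlier sketch.

```java
import ilog.concert.*;
import ilog.cplex.*;
import java.util.*;

/** Rough sketch only: builds the model with CPLEX (Concert API), using a
 *  big-M linearization of the MTZ indicator constraints. Assumes cells are
 *  numbered 1..mn (depot = 0), r[i] is the reward of cell i, and N.get(i)
 *  is the neighbor set of cell i (with the depot adjacent to every cell). */
public class PartialTourModel {
    public static IloCplex build(double[] r, List<Set<Integer>> N) throws IloException {
        int mn = r.length - 1;                          // number of grid cells
        IloCplex cplex = new IloCplex();

        IloIntVar[][] x = new IloIntVar[mn + 1][mn + 1];
        IloIntVar[] y = new IloIntVar[mn + 1];
        IloNumVar[] z = new IloNumVar[mn + 1];
        for (int i = 0; i <= mn; i++) {
            z[i] = cplex.numVar(0, mn, "z" + i);        // MTZ position variable
            if (i > 0) y[i] = cplex.boolVar("y" + i);   // cell i visited?
            for (int j = 0; j <= mn; j++) {
                x[i][j] = cplex.boolVar("x" + i + "_" + j);
                // movement only between adjacent cells
                if (!N.get(i).contains(j)) x[i][j].setUB(0);
            }
        }

        // objective: total reward of visited cells
        IloLinearNumExpr obj = cplex.linearNumExpr();
        for (int i = 1; i <= mn; i++) obj.addTerm(r[i], y[i]);
        cplex.addMaximize(obj);

        // depot exited once and entered once
        IloLinearNumExpr out0 = cplex.linearNumExpr(), in0 = cplex.linearNumExpr();
        for (int i = 1; i <= mn; i++) { out0.addTerm(1, x[0][i]); in0.addTerm(1, x[i][0]); }
        cplex.addEq(out0, 1);
        cplex.addEq(in0, 1);

        // every other cell entered and exited once if visited, not at all otherwise
        for (int i = 1; i <= mn; i++) {
            IloLinearNumExpr in = cplex.linearNumExpr(), out = cplex.linearNumExpr();
            for (int j = 0; j <= mn; j++) if (j != i) {
                in.addTerm(1, x[j][i]);
                out.addTerm(1, x[i][j]);
            }
            cplex.addEq(in, y[i]);
            cplex.addEq(out, y[i]);
        }

        // at most one visited cell per reward value
        Map<Double, List<Integer>> clusters = new HashMap<>();
        for (int i = 1; i <= mn; i++)
            clusters.computeIfAbsent(r[i], k -> new ArrayList<>()).add(i);
        for (List<Integer> cells : clusters.values()) {
            IloLinearNumExpr e = cplex.linearNumExpr();
            for (int i : cells) e.addTerm(1, y[i]);
            cplex.addLe(e, 1);
        }

        // MTZ, linearized with M = mn + 1: z_j >= z_i + 1 - M (1 - x_ij), j > 0
        // (only needed for pairs where x_ij can actually be 1)
        double M = mn + 1;
        for (int i = 0; i <= mn; i++)
            for (int j : N.get(i)) if (j > 0) {
                IloLinearNumExpr e = cplex.linearNumExpr();
                e.addTerm(1, z[j]); e.addTerm(-1, z[i]); e.addTerm(-M, x[i][j]);
                cplex.addGe(e, 1 - M);
            }

        // first and last grid cells of the tour must be adjacent (or the same)
        for (int i = 1; i <= mn; i++)
            for (int j = 1; j <= mn; j++)
                if (i != j && !N.get(i).contains(j))
                    cplex.addLe(cplex.sum(x[0][i], x[j][0]), 1);

        return cplex;
    }
}
```

Recovering the tour from a solution is just a matter of starting at the depot and following the $x$ variables with value 1 until the depot reappears.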
 
I tested the code on the example in the Math SE post, using Java and the FICO Xpress MIP solver, and it produced what appears to be an optimal tour (confirmed by the original poster) in about two minutes on my somewhat archaic desktop. I posted the tour in a comment to my answer to the original question. The Java code is available (under a Creative Commons 4.0 license) from my Git repository.

Sunday, November 10, 2024

Solver Parameters Matter

Modern integer programming solvers come with a gaggle of parameters the user can adjust. There are so many possible parameter combinations that vendors are taking a variety of approaches to taming the beast. The first, of course, is for vendors to set default values that work pretty well most of the time. This is particularly important since many users probably stick to default settings unless they absolutely have to start messing with the parameters. (By "many" I mean "myself and likely others".) The second is to provide a "tuner" that the user can invoke. The tuner experiments with a subset of possible parameter settings to try to find a good combination. Third, I've seen some discussion and I think some papers or conference presentations on using machine learning to predict useful settings based on characteristics of the problem. I am not sure how far along that research is and whether vendors are yet implementing any of it.

In the past few days I got a rather vivid reminder of how much a single parameter tweak can affect things. Erwin Kalvelagen did a blog post on sorting a numeric vector using a MIP model (with an up front disclaimer that it "is not, per se, very useful"). He tested a couple of variants of a model on vectors of dimension 50, 100 and 200. I implemented the version of his model with the redundant constraint (which he found to speed things up) in Java, using the current versions of both CPLEX and Xpress as solvers. The vector to sort was generated randomly, with components distributed uniformly over the interval (0, 1). I tried a few random number seeds, and while times varied a bit, the results were quite consistent. Not wanting to devote too much time to this, I set a time limit of two minutes per solver run.

Using default parameters, both solvers handled the dimension 50 case, but CPLEX was about eight times faster. For dimension 100, CPLEX pulled off the sort in two seconds but Xpress still did not have a solution at the two minute mark. For dimension 200, CPLEX needed around 80 seconds and Xpress, unsurprisingly, struck out.

So CPLEX is faster than Xpress, right? Well, hang on a bit. On the advice of FICO's Daniel Junglas, I adjusted one of their parameters ("PreProbing") to a non-default value. This is one of a number of parameters that will cause the solver to spend more time heuristically hunting for a feasible or improved solution. Using my grandmother's adage "what's sauce for the goose is sauce for the gander," I tweaked an analogous parameter in CPLEX ("MIP.Strategy.Probe"). Sure enough, both solvers got faster on the problem (and Xpress was able to solve all three sizes), but the changes were more profound than that. On dimension 50, Xpress was between three and four times faster than CPLEX. On dimension 100, Xpress was again around four times faster. On dimension 200, Xpress won by a factor of slightly less than three.
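
For reference, here is roughly what such a tweak looks like on the CPLEX side using its Java API. The probing value shown is just an illustration of a more aggressive setting (not necessarily the value that produced the timings above), and the time limit is the two-minute cap mentioned earlier; the Xpress "PreProbing" control is set analogously through the Xpress API.

```java
import ilog.concert.IloException;
import ilog.cplex.IloCplex;

public class ProbeSettingExample {
    public static void main(String[] args) throws IloException {
        IloCplex cplex = new IloCplex();
        // More aggressive probing; CPLEX's default is 0 ("automatic") and 3 is
        // the most aggressive level. The value here is purely illustrative.
        cplex.setParam(IloCplex.Param.MIP.Strategy.Probe, 3);
        // The two-minute time limit (in seconds) used in the tests.
        cplex.setParam(IloCplex.Param.TimeLimit, 120.0);
        // ... build and solve the sorting model here ...
        cplex.end();
    }
}
```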

So is Xpress actually faster than CPLEX on this problem? Maybe, maybe not. I only tweaked one parameter among several that could be pertinent. To me, at least, there is nothing about the problem that screams "you need more of this" or "you need less of that", other than the fact that the objective function is a constant (we are just looking for a feasible solution), which suggests that any changes designed to tighten the bound faster are likely to be unhelpful. I confess that I also lack a deep understanding of what most parameters do internally, although I have a pretty good grasp on the role of the time limit parameter.

So the reminder for me is that before concluding that one solver is better than another on a problem, or that a problem is too difficult for a particular solver, I need to put a bit of effort into investigating whether any parameter tweaks have substantial impact on performance.

Update: A post on the Solver Max blog answers a question I had (but did not investigate). With feasibility problems, any feasible solution is "optimal", so it is common to leave the objective as optimizing a constant (usually zero). Erwin and I both did that. The question that occurred to me (fleetingly) was whether an objective function could be crafted that would help the solver find a feasible solution faster. In this case, per the Solver Max post, the answer appears to be "yes".

Sunday, November 3, 2024

Xpress and RStudio

The following is probably specific to Linux systems. I recently installed the FICO Xpress optimizer, which comes with an R library to provide an API for R code. FICO requires a license file (or a license server -- I went with a static file since I'm a single user) and adds an assortment of environment variables to the bash shell, including one pointing to the license file. So far, so good.

Xpress comes with example files, including example R scripts. So I cranked up RStudio, opened the simplest example ("first_lp_problem.R", which is just what it sounds like) and executed it line by line. The problem setup lines worked fine, but the first Xpress API call died with an error message saying it couldn't find the license file in directory "." (i.e., the current working directory). The same thing happened when I tried to source the file in the RStudio console.

To make a long story somewhat shorter, after assorted failed attempts to sort things out it occurred to me to run R in a terminal and source the example file there. That ran smoothly. So the problem was with RStudio, not with R. Specifically, it turns out that RStudio runs without loading any bash environment variables.

After assorted failed attempts at a fix (and pretty much wearing out Google), I found the following solution. In my home directory ("/home/paul", a.k.a. "~") I created a text file named ".Renviron". In it, I put the line "XPAUTH_PATH=/home/paul/.../xpauth.xpr", where "..." is a bunch of path info you don't need to know and "xpauth.xpr" is the name of the license file. If you already have a ".Renviron" file, you can just add this line to it. The example script now runs fine in RStudio. Note that there are a gaggle of other bash environment variables created by Xpress, none of which presumably are known to RStudio, but apparently the license file path is the only one needed by the API (at least so far). If I trip over any other omissions later on, presumably I can add them to ".Renviron".
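
For anyone wanting to replicate this, the ".Renviron" entry is a single line of the form shown below (license path elided, as above). After restarting the R session, running Sys.getenv("XPAUTH_PATH") in the RStudio console should display the path if RStudio has picked it up.

```
XPAUTH_PATH=/home/paul/.../xpauth.xpr
```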