OR in an OB World: Finding "All" MIP Optima: The CPLEX Solution Pool

Thursday, January 24, 2013

Following up on my recent post about finding all optimal solutions (or at least multiple optimal solutions) to MIP models, it turns out that recent versions of CPLEX make this rather easy to program (if not necessarily quick to do) through their solution pool feature. I'll illustrate the technique with a simple example.

The problem

The example problem is to find $N$ binary vectors of dimension $D$ so as to maximize the minimum Hamming distance between them. One source of this problem is coding theory (not to be confused with cryptography): we might want a "vocabulary" of $N$ "words" that are as unlikely as possible to be confused with each other if individual bits are independently corrupted, with low likelihood, during transmission. The mathematical model is \[ \begin{array}{lrclr} \textrm{maximize} & & d\\ \textrm{s.t.} & \Vert x^{(j)}-x^{(k)}\Vert & \ge & d & \forall j,k\in\left\{ 1,\dots,N\right\} \ni j\lt k\\ & x^{(j)} & \in & \mathbb{B}^{D} & \forall j\in\{1,\dots,N\} \end{array} \] where $\Vert\cdot\Vert$ denotes the Hamming weight (number of non-zeros) of a vector and $\mathbb{B}^D$ is the space of $D$-dimensional binary vectors.
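For very small instances the problem can be solved by sheer enumeration, which is a handy sanity check on whatever the MIP solver reports. The following is only a sketch in Python (my own code was Java); the function names `hamming` and `max_min_distance` are made up for illustration, and it assumes $2 \le N \le 2^D$.

```python
from itertools import combinations, product


def hamming(u, v):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(a != b for a, b in zip(u, v))


def max_min_distance(n, d):
    """Largest achievable minimum pairwise Hamming distance for n words of d bits.

    Pure enumeration over all sets of n distinct words, so it is only
    sensible for very small n and d. Assumes 2 <= n <= 2**d.
    """
    words = list(product((0, 1), repeat=d))
    best = 0
    # A repeated word would give distance 0, so sets of distinct words suffice.
    for code in combinations(words, n):
        score = min(hamming(u, v) for u, v in combinations(code, 2))
        best = max(best, score)
    return best


print(max_min_distance(4, 3))
```

With $N=4$ and $D=3$ the answer is 2, attained for instance by the four even-weight words 000, 011, 101 and 110.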

The MILP model

To massage this into a mixed-integer linear program (MILP), we introduce the following auxiliary variables:

$y^{(j,k)}_i\in\mathbb{B}$ is 1 if and only if vectors $x^{(j)}$ and $x^{(k)}$ differ in component $i$;
$d^{(j,k)}$ is the Hamming distance between vectors $x^{(j)}$ and $x^{(k)}$; and
$d$ is the minimum Hamming distance between any pair of vectors (to be maximized).

The MILP model is:

\[ \begin{array}{lrclr} \textrm{maximize} & & d\\ \textrm{s.t.} & y_{i}^{(j,k)} & \le & x_{i}^{(j)}+x_{i}^{(k)} & \forall i\in\{1,\dots,D\};\forall j,k\in\left\{ 1,\dots,N\right\} \ni j\lt k\\ & y_{i}^{(j,k)} & \le & 2-x_{i}^{(j)}-x_{i}^{(k)} & \forall i\in\{1,\dots,D\};\forall j,k\in\left\{ 1,\dots,N\right\} \ni j\lt k\\ & d^{(j,k)} & = & \sum_{i=1}^{D}y_{i}^{(j,k)} & \forall j,k\in\left\{ 1,\dots,N\right\} \ni j\lt k\\ & d & \le & d^{(j,k)} & \forall j,k\in\left\{ 1,\dots,N\right\} \ni j\lt k\\ & x^{(j)} & \in & \mathbb{B}^{D} & \forall j\in\{1,\dots,N\}\\ & y^{(j,k)} & \in & \mathbb{B}^{D} & \forall j,k\in\left\{ 1,\dots,N\right\} \ni j\lt k\\ & d^{(j,k)} & \in & [0,D] & \forall j,k\in\left\{ 1,\dots,N\right\} \ni j\lt k\\ & d & \in & [0,D]. \end{array} \]
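The first two constraint families are what tie $y$ to the $x$ vectors. A quick way to convince yourself that they do the job is to tabulate, for every pair of bits, the largest value of $y$ they allow and compare it with the exclusive-or of the bits. This is just a sketch; `y_upper_bound` is a hypothetical helper, not part of the model.

```python
from itertools import product


def y_upper_bound(a, b):
    """Largest y allowed by y <= a + b and y <= 2 - a - b for bits a, b."""
    return min(a + b, 2 - a - b)


for a, b in product((0, 1), repeat=2):
    # The bound is 1 exactly when the two bits differ, i.e. it equals a XOR b.
    print(a, b, y_upper_bound(a, b), a ^ b)
```

Note that the constraints bound $y$ only from above. The objective pushes $d$ (and hence the binding $d^{(j,k)}$) upward, so there is always an optimal solution in which every $y$ sits at its bound.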

Symmetry

The model contains a high degree of symmetry and, as I've mentioned before, symmetry will slow the search for multiple optima (while, at the same time, it can be exploited to easily generate alternate optima from a given solution). I'll add some constraints to the MILP model to get rid of at least some of the symmetry.

First, consider any solution $x^{(1)},\dots,x^{(N)}$ to the model, and suppose I toggle bit $i$ in every vector; that is, define $\hat{x}^{(j)}$ for $j\in\{1,\dots,N\}$ by \[ \hat{x}_{h}^{(j)}=\begin{cases} x_{h}^{(j)} & h\neq i\\ 1-x_{i}^{(j)} & h=i \end{cases}. \] We obtain a new solution with $\Vert\hat{x}^{(j)}-\hat{x}^{(k)}\Vert=\Vert x^{(j)}-x^{(k)}\Vert\:\forall j\lt k$, so the new solution has the same objective value as the old one. Therefore, without loss of generality, I can require that $x^{(1)}=0$.
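The toggling argument is easy to check numerically. In the sketch below (the helper names `toggle_bit` and `shift_first_to_zero` are invented for illustration), every bit position in which the first vector holds a 1 is toggled in all vectors, which turns the first vector into 0 and leaves all pairwise distances untouched.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(a != b for a, b in zip(u, v))


def toggle_bit(code, i):
    """Flip bit i in every word of the code."""
    return [tuple(1 - b if h == i else b for h, b in enumerate(w)) for w in code]


def shift_first_to_zero(code):
    """Toggle every position where the first word is 1, so it becomes all zeros."""
    first = code[0]
    for i, bit in enumerate(first):
        if bit:
            code = toggle_bit(code, i)
    return code


def pairwise(code):
    """List of Hamming distances over all pairs j < k."""
    return [hamming(u, v) for u, v in combinations(code, 2)]


code = [(1, 0, 1), (0, 1, 1), (1, 1, 0)]
shifted = shift_first_to_zero(code)
print(shifted, pairwise(code), pairwise(shifted))
```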

Second, if we permute the indices of the vectors (other than the first vector, which we have forced to be 0), we obtain another set of $N$ vectors with the same pairwise distances. To eliminate that, we will constrain the vectors to be lexicographically increasing; that is, if $x^{(j)}_h=x^{(j+1)}_h$ for $h\lt i$ and $x^{(j)}_i \neq x^{(j+1)}_i$, then $x^{(j)}_i=0$ and $x^{(j+1)}_i = 1$. Lexicographic ordering constraints can in general be a pain to write, but our auxiliary variables $y$ actually make it simple here:\[ x_{i}^{(j+1)}\ge x_{i}^{(j)}-\sum_{h\lt i}y_{h}^{(j,j+1)}. \]
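To see that the ordering constraint behaves as advertised, here is a small brute-force check. It assumes each $y_h^{(j,j+1)}$ is set to the exclusive-or of the corresponding bits (its upper bound in the model), and the function name is hypothetical. For every pair of 4-bit vectors, the constraints hold for all $i$ exactly when the first vector is lexicographically no larger than the second.

```python
from itertools import product


def satisfies_ordering_constraints(x, x_next):
    """Check x_next[i] >= x[i] - sum_{h<i} y[h] for every i, with y = x XOR x_next."""
    y = [a ^ b for a, b in zip(x, x_next)]
    return all(x_next[i] >= x[i] - sum(y[:i]) for i in range(len(x)))


vectors = list(product((0, 1), repeat=4))
# Python compares tuples lexicographically, which gives an independent reference.
agree = all(
    satisfies_ordering_constraints(u, v) == (u <= v)
    for u in vectors
    for v in vectors
)
print(agree)
```

Once the two vectors differ at some position, the sum on the right is at least 1 and the constraint for every later position is slack, which is why a single family of linear inequalities suffices.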

Setting up the solution pool

Before invoking CPLEX's populate method to fill in the solution pool, we need to set some parameter values. Let's say that our ambition is to find $M$ optimal solutions if possible. (To find all optimal solutions, choose a really large value of $M$ and hope you live long enough to see the run terminate.) The first thing to do is to set the SolnPoolCapacity parameter to $M$. Next, set the SolnPoolIntensity parameter to 4, the most aggressive setting ("leave no stone unturned").

The PopulateLim parameter dictates the maximum number of integer-feasible solutions CPLEX will generate before it declares victory and retires from the battlefield. It may stop before reaching this limit because it has exhausted all possible solutions, or because no possible solutions remain that would satisfy the pool gap parameters (discussed below), or because the standard time limit or node limit is reached. The default value for this parameter is 20. You will want it at least $M$, but realistically you need to set it considerably larger than $M$, since suboptimal solutions found along the way count against the limit. Set it to something really, really large to be safe.

We want to screen out suboptimal solutions. Since the objective function is integer-valued, a suboptimal solution will be worse than the incumbent by at least 1. Allowing for some rounding error, we set the pool absolute gap parameter (SolnPoolAGap) to something between 0 and 1 (I'll use 0.5). Any solution within 0.5 of the incumbent will have the same minimum Hamming distance after cleaning up rounding error; any solution more than 0.5 worse than the incumbent has a minimum Hamming distance at least 1 smaller and is thus suboptimal. (I'm trusting that I won't get a rounding error worse than 0.5 in any objective value.) We will leave the relative pool gap parameter (SolnPoolGap) at its default value of $10^{75}$.
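The arithmetic behind the choice of 0.5 can be illustrated with a toy filter. This is not CPLEX's implementation, just a sketch of the rule for a maximization problem, with a made-up function name and made-up objective values that include a bit of rounding noise.

```python
def admitted_to_pool(objective, incumbent, abs_gap=0.5):
    """Toy absolute-gap test for a maximization problem."""
    return incumbent - objective <= abs_gap


incumbent = 3.0
candidates = [3.0000001, 2.9999998, 2.0, 1.0]  # hypothetical objective values
kept = [z for z in candidates if admitted_to_pool(z, incumbent)]
print(kept)
```

The two values that round to 3 pass the test, while 2.0 and 1.0 (genuinely suboptimal) do not.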

The gap parameter prevents CPLEX from adding solutions to the pool that are inferior to the best solution already in the pool. It does not prevent suboptimal solutions from being added to the pool before the optimum is found, nor does it force them to be dropped from the pool once better solutions are found. The documentation may be a bit unclear on the latter point; in the sections on the gap parameters, it says that inferior solutions will be discarded but does not explicitly limit that statement to new solutions. I've verified by experiment, though, that inferior solutions found prior to the first truly optimal incumbent may be retained.

The (partial) cure for this is to adjust the pool replacement strategy (SolnPoolReplace). This parameter tells CPLEX how to make room when the pool is full and a new integer-feasible solution is found. Note that, by virtue of our choice of SolnPoolAGap value, the newly found solution will be at least as good as the best of the solutions in the pool. The default value (CPX_SOLNPOOL_FIFO = 0) uses a first in, first out replacement strategy, without regard to objective value. What we want is CPX_SOLNPOOL_OBJ = 1, which kicks out the solution with the worst objective value to make room for the new and improved solution. (If you want to find lots of feasible but not necessarily optimal solutions, you might want CPX_SOLNPOOL_DIV = 2, which kicks out solutions so as to enhance the diversity of the pool.)
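Pulling the settings from this section together, here is a rough sketch using the CPLEX Python API rather than the Java API I actually used. The parameter paths (`parameters.mip.pool.capacity` and friends) are written from my reading of that API and should be checked against the documentation for your CPLEX version; the function name and the default populate limit are arbitrary choices for illustration.

```python
def configure_solution_pool(cpx, m, populate_limit=1_000_000):
    """Apply the pool settings discussed above to a cplex.Cplex-like object.

    cpx is assumed to expose the CPLEX Python API parameter hierarchy;
    m is the number of optimal solutions we hope to collect.
    """
    pool = cpx.parameters.mip.pool
    pool.capacity.set(m)    # SolnPoolCapacity: room for M solutions
    pool.intensity.set(4)   # SolnPoolIntensity: most aggressive setting
    pool.absgap.set(0.5)    # SolnPoolAGap: integer objective, so 0.5 is safe
    pool.replace.set(1)     # SolnPoolReplace: CPX_SOLNPOOL_OBJ, drop the worst
    # PopulateLim: far larger than M, since suboptimal solutions count too.
    cpx.parameters.mip.limits.populate.set(populate_limit)
```

After this, one would call the populate method on the same object (in the Python API that is, I believe, `populate_solution_pool()`).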

Are we done yet???

Now call the populate method (whether in an API or the interactive optimizer), sit back, and wait for the results. When CPLEX finishes, the pool is full of optimal solutions, right? Not quite. If there are fewer than $M$ optimal solutions to your model, chances are quite high that some suboptimal solutions will have made it into the pool. Since the pool never filled up, no solutions were kicked out, so they'll still be sitting there. You will need to check the objective values of each solution in the pool, find the best of those values, and use it to weed out the suboptimal solutions.
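The weeding-out step is simple enough to sketch independently of any solver API. Assume the pool has been read into a list of (objective value, solution) pairs; the function name, the data and the 0.5 tolerance (justified here by the integer-valued objective) are all illustrative.

```python
def keep_optimal(pool, tol=0.5, maximize=True):
    """Return only the pool entries whose objective is within tol of the best.

    pool is a list of (objective_value, solution) pairs.
    """
    if not pool:
        return []
    values = [obj for obj, _ in pool]
    best = max(values) if maximize else min(values)
    return [(obj, sol) for obj, sol in pool if abs(obj - best) <= tol]


# Hypothetical pool contents: two optimal solutions and two leftovers.
pool = [(2.0, "a"), (3.0000001, "b"), (2.9999999, "c"), (1.0, "d")]
survivors = [sol for _, sol in keep_optimal(pool)]
print(survivors)
```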

You are welcome to see my Java source code for this problem. I wrote it for Java 7, but I believe it will work with Java 6 if you edit one line.

45 comments:

Unknown, January 28, 2013 at 10:23 AM
"Look what's in the CPLEX solution pool and see what can be done with it" is on my to-do list for 2013. Besides being interesting to read, your post will probably help me to save time in the near future! Thank you for taking the time to write this stuff.

Niraj, January 6, 2014 at 2:38 PM
Hi Paul,
Thanks again. Yes, I have followed all the instructions in the manual. What surprised me was that, in one case, when I removed a set of constraints, the total number of solutions reported by the cplex.populate() method decreased. In another case, when I set the solution pool intensity parameter to 4, the number of solutions reported in the solution pool also decreased, which shouldn't happen according to the definition in the manual.
I was asking how cplex.populate() works. Does it store all the integer incumbent solutions found prior to the optimal solution and report them in the solution pool, or does it also look into the nodes that were fathomed because of bounds while finding the optimal solution, and report other feasible solutions from there?

Niraj, January 13, 2014 at 3:53 PM
Paul,
Thanks once again. So what I understand from what you are saying is that maybe the cplex.populate() method returns only "integer-feasible corner point solutions". That may be the reason that the total number of solutions can decrease in some cases when we decrease the number of constraints.
Please correct me if I am wrong.

Niraj

Paul A. Rubin, January 13, 2014 at 4:48 PM
No, populate() returns integer-feasible solutions, regardless of whether they are corner points or not. I was just remarking that removing a constraint changes the corner points, so it can alter the number of integer corners in either direction (up or down), but it cannot decrease the set of integer-feasible solutions (corner and interior combined).

Niraj, January 16, 2014 at 11:22 AM
Paul,
Here is something interesting that I observed about the populate method, and thought of sharing with you. I was solving a small example (which I have written below) to check if the populate method works as it is supposed to. This problem has eight feasible solutions but when I used populate method, no matter what parameter I use, it returns only one solution. But for the same problem when I changed the sense of objective (x1+x2) as maximize it gave me all 8 feasible solutions. I tried another setting in which the sense of objective was maximize but changed the objective to (-x1-x2), it again returned me one solution. With the same objective (-x1-x2) I changed the sense of objective to minimize and it returned all the 8 feasible solutions. I would like to know your thoughts on this.
Minimize
obj: x1 + x2
Subject To
c1: 3 x1 + x2 <= 6
c2: x1 + 2 x2 <= 6
Bounds
x1 >= 0
x2 >= 0
Generals
x1 x2
End

Da, April 23, 2014 at 2:52 PM
Hello, Dr. Rubin,

Thanks for providing an example on populating solutions. I'm wondering if there is a way to check all feasible solutions that cplex encounters during its branch-and-cut procedure. To be specific, I don't want to disturb the default algorithms and paths used by CPLEX when it solves an MIP problem. I just want to check every feasible solution it encounters during its process so that I can keep some of them based on my own criteria. I checked the diversity filters, but it seems not helpful in my case because I would like the feasible solution kept in the pool to be all different.

Thank you very much,
Da

Unknown, May 27, 2014 at 4:46 AM
Hi Paul, I have a question in another context in CPLEX. I want to express a constraint requiring that two BoolVarArrays be different; these arrays are the solutions. So I am looking for a constraint like allDiff in Gecode. If you have an idea of how I can express it, thank you.

Anonymous, May 27, 2014 at 11:31 AM
Dear Dr. Rubin,

Is it possible to generate all possible solutions for an MIP problem, by applying 'populate' to a subset of integer variables? I mean, for instance, I have 30 integer variables, and I only want to populate 15 of them, regardless of the values for the others?

Thank you

Mert

Anonymous, August 1, 2014 at 1:45 PM
Dr Rubin, in CPLEX output, How to make sense of absolute gap, relative gap, optimality gap? if I get optimality gap >0.4% is it still a good solution?

Anonymous, November 5, 2015 at 10:54 PM
Hi, Dr. Rubin

Thank you so much for the post. I am using CPLEX to solve an ILP with a sparse matrix containing 19000 variables and 13000 constraints. Each variable is restricted to {-1,0,1}. CPLEX can quickly get the optimal value. But I am interested in which variables can be fixed at the optimal value; let's say a variable is assigned to 1 in >80% of solutions. Do you think it is possible to "populate()" them? Or is there any other way to do this?

Best,
Ian

Unknown, May 6, 2016 at 1:36 PM
Hi Dr. Rubin,

Thanks for your help about the Solution Pool. I'm actually trying to solve a medium size MIP, but I need to obtain the time elapsed until the first feasible solution was found... also I need the time until the best feasible solution was found.

I haven't found something in CPLEX documentation that can help me to do this, however I know this data is commonly used in research...

Do you know what would be the best way to do this?

Best,
Javier

Anonymous, May 12, 2016 at 8:58 AM
Hi
I am using IBM ILOG CPLEX Optimization Studio. Thanks to your useful hints I am able to see my solution pool in the lower-left part of the software window. Now, in order to see each solution I have to choose "pool solution #..".
1. Is there a way to display or extract all solutions in the solution pool at the same time?
2. My model contains continuous and integer variables. Does the solution pool show solutions that differ in the continuous variables as well?

Zoubeir, November 28, 2016 at 4:51 PM
Thanks for this helpful post. Can I call populate in integer (binary) programming problems with CPLEX? I tried to do it but I get an error message like 'CPLEX Error 3003: Not a mixed-integer problem.'

Unknown, July 2, 2017 at 7:11 AM
I am using the CPLEX solver for binary integer linear programming (cplexbilp) in MATLAB and would like to print out the identified alternative solutions in the solution pool. The code looks as follows:

options = cplexoptimset('cplex');
options.Display = 'on';
options.solnpoolagap = 0;
options.solnpoolintensity = 4;
options.populatelim = 100;
[x, fval, exitflag, output] = cplexbilp (f, Aineq, bineq, Aeq, beq, ...
[ ], options);
Each time only one solution is stored in x, but I know as CPLEX tells me that there are other solutions as well.

Do you know how to print out those alternative solutions?

Anonymous, November 19, 2018 at 12:18 PM
Hi Dr. Rubin,
I am trying to change the solution pool capacity parameter between successive calls of populate, but it seems that only the capacity from the first populate call is kept. (For example, I call populate 100 times; the first capacity is set to 10, and then I set the capacity to 20 and call populate again.) Can I change the capacity parameter when calling populate multiple times? Thank you.

Qingwei, November 20, 2018 at 1:38 PM
Dear Prof. Rubin:
Thank you so much! Yes, I set the replacement strategy to 1. Your answer really helped me. Thank you again!
Best regards,
Qingwei

Paul A. Rubin, November 20, 2018 at 2:12 PM
You are very welcome. Glad it helped.

Anonymous, February 22, 2019 at 12:32 PM
Hi Dr. Rubin,
Thank you very much for this post and your blog. First of all, I am quite newbie in OR. I have a partitioning problem where I need to find all optimal solutions. By following your post and the IBM manual (I set 'SolnPoolAGap' to 0.5), I was able to find many optimal solutions. Based on this, I would like to ask 2 questions:
1) Can we say for sure that we get all optimal solutions?
2) I ran into some memory problem for some instances, please note that the memory (RAM) I use is quite large: 256 gb. So, I am thinking an alternative way which consumes less memory when enumerating all optimal solutions. I have two ideas (but, I do not know if it is feasible):
*) Using cutting plane approach. Hence, we do not add all triangle constraints, it might help us gain memory usage. However, I am not sure how to implement this. If I first generate user cuts within X minutes (Branch&Cut), and then I solve the same instance with generated tight user cuts (Branch&Bound), this would be a good idea?
*) I can (efficiently) solve the problem for one optimal solution, and get the objective function value. Then, use this objective function value as constraint, and solve the same instance again but for all optimal solution.

I am open to all your suggestions. Thanks in advance.

Nejat