OR in an OB World: Benders Decomposition with Integer Subproblems

Wednesday, July 31, 2013

Benders Decomposition with Integer Subproblems

I'm not sure why, but occasionally people post questions related to an attempt to apply Benders decomposition in a situation where the subproblem contains integer variables. A key question is how you generate cuts from the subproblem, since discrete problems do not enjoy the same duality theory that continuous problems do.

A typical application of Benders decomposition to integer programming starts with a problem of the form\[ \begin{array}{lrclcc} \textrm{minimize} & c_{1}'x & + & c_{2}'y\\ \textrm{subject to} & Ax & & & \ge & a\\ & & & By & \ge & b\\ & Dx & + & Ey & \ge & d\\ & x & \in & \mathbb{Z}^{m}_+\\ & y & \in & \mathbb{R}^{n}_+ \end{array} \]This decomposes into a master problem\[ \begin{array}{lrclcc} \textrm{minimize} & c_{1}'x & + & z\\ \textrm{subject to} & Ax & & & \ge & a\\ & h'x & & & \ge & h_0 & \forall (h,h_0)\in \mathcal{F}\\ & h'x & + & z & \ge & h_0 & \forall (h, h_0)\in \mathcal{O}\\ & x & \in & \mathbb{Z}^{m}_+ \\ & z & \ge & 0 \end{array} \]and a subproblem\[ \begin{array}{lrclcc} \textrm{minimize} & c_{2}'y\\ \textrm{subject to} & By & \ge & b\\ & Ey & \ge & d - Dx\\ & y & \in & \mathbb{R}^{n}_+ \end{array} \]where $\mathcal{F}$ and $\mathcal{O}$ are sets of coefficient vectors for "feasibility" cuts (pushing $x$ in directions that make the solution $(x,y)$ feasible) and "optimality" cuts (pushing $z$ upward so as not to underestimate $c_2'y$) respectively. The subproblem is a linear program, and its dual solution supplies the coefficient vectors $(h,h_0)$ for both types of master cuts.
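
To make the last sentence concrete: suppose $u$ and $v$ are dual multipliers for the subproblem constraints $By\ge b$ and $Ey\ge d-Dx$ (these names are assumptions for illustration, not part of the formulation above). Weak duality gives $z\ge u'b+v'(d-Dx)$, which rearranges to $(D'v)'x+z\ge u'b+v'd$, so $h=D'v$ and $h_0=u'b+v'd$. A dual ray yields a feasibility cut with the same $h$ and $h_0$ but no $z$ term. A minimal Python sketch of that bookkeeping, using plain lists and a hypothetical function name:

```python
def benders_cut_from_duals(u, v, b, d, D):
    """Return (h, h0) for a Benders cut built from subproblem duals.

    u: multipliers for By >= b, v: multipliers for Ey >= d - Dx
    (hypothetical names). D is a list of rows, one per linking constraint.
    Optimality cut: h'x + z >= h0. Feasibility cut (dual ray): h'x >= h0.
    """
    n_x = len(D[0])
    # h = D'v: one coefficient per master variable x_j
    h = [sum(v[i] * D[i][j] for i in range(len(v))) for j in range(n_x)]
    # h0 = u'b + v'd
    h0 = sum(ui * bi for ui, bi in zip(u, b)) + sum(vi * di for vi, di in zip(v, d))
    return h, h0


# One constraint of each type, two master variables (made-up data).
h, h0 = benders_cut_from_duals(u=[1.0], v=[0.5], b=[2.0], d=[3.0], D=[[2.0, 0.0]])
# Resulting optimality cut: 1.0*x_1 + 0.0*x_2 + z >= 3.5
```

In practice the multipliers would come from whatever LP solver handles the subproblem; extracting them is solver-specific and is not shown here.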

So what happens if $y$ is integer-valued ($y\in\mathbb{Z}^n_+$) rather than continuous ($y\in\mathbb{R}^n_+$)? I don't have a definitive answer, but there are a few things that can be tried. The following suggestions should also work equally well (or poorly) when $y$ is a mix of integer and continuous variables.

Proceed as usual

The subproblem is now an integer program, but you can always relax it to a linear program and obtain the dual solution to the relaxation. If the current master solution $x = \hat{x}$, $z = \hat{z}$ makes the linear relaxation of the subproblem infeasible, you can be sure it also makes the actual subproblem infeasible, and thus you will get a legitimate feasibility cut. If the subproblem is feasible, the dual to the relaxation will produce a cut that forces $z$ to be at least as great as the objective value of the relaxation, which is a legitimate lower bound for the actual subproblem objective value.

The news here is not all good, though. It is possible that $\hat{x}$ makes the subproblem integer-infeasible but with a feasible relaxation, in which case you will not get the feasibility cut you need. If the subproblem is feasible (let's say with optimal solution $\hat{y}$) but $\hat{z}$ underestimates its objective value $c_2'\hat{y}$, you want an optimality cut that forces $z\ge c_2'\hat{y}$ when $x=\hat{x}$; but the cut you get forces $z\ge w$ where $w$ is a lower bound for $c_2'\hat{y}$, and so there is the possibility that $c_2'\hat{y} > \hat{z} \ge w$ and the optimality cut accomplishes nothing.
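
A tiny made-up instance shows the second failure mode. Assume a one-variable subproblem, minimize $y$ subject to $2y\ge 3-2x$ with $y$ a nonnegative integer, and a hypothetical master solution $\hat{x}=0$, $\hat{z}=1.7$. The sketch below (function names are illustrative only) computes the relaxation bound $w$ and the true subproblem value:

```python
import math


def lp_relaxation_value(rhs, a):
    """min y subject to a*y >= rhs, y >= 0 continuous (assumes a > 0)."""
    return max(0.0, rhs / a)


def ip_value(rhs, a):
    """Same problem with y restricted to nonnegative integers."""
    return max(0, math.ceil(rhs / a))


x_hat, z_hat = 0, 1.7               # hypothetical master solution
rhs = 3 - 2 * x_hat                 # right-hand side d - D*x_hat
w = lp_relaxation_value(rhs, 2)     # bound enforced by the relaxation cut: 1.5
true_value = ip_value(rhs, 2)       # actual subproblem optimum: 2

cut_is_violated = z_hat < w         # False: the cut does not move z_hat
underestimates = z_hat < true_value # True: z_hat is still too small
```

Here $c_2'\hat{y}=2>\hat{z}=1.7\ge w=1.5$, so the cut derived from the relaxation is valid but leaves the master solution unchanged.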

"No good" constraints for infeasibility

Suppose that $x$ consists exclusively of binary variables. (General integer variables can always be converted to binary variables, although it's not clear that the conversion is in general desirable.) We can exclude a particular solution $x=\hat{x}$ with a "no good" constraint that forces at least one of the variables to change value:\[\sum_{i : \hat{x}_i=0} x_i + \sum_{i : \hat{x}_i = 1} (1-x_i)\ge 1.\]This gives us another option for feasibility cuts. Solve the subproblem as an IP (without relaxation); if the subproblem is infeasible, add a "no good" cut to the master problem. Note that "no good" cuts are generally not as deep as regular Benders feasibility cuts -- the latter may cut off multiple integer solutions to the master problem, whereas a "no good" cut only cuts off a single solution.
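
Moving the constants to the right-hand side, the "no good" constraint becomes $\sum_{i:\hat{x}_i=0}x_i-\sum_{i:\hat{x}_i=1}x_i\ge 1-|\{i:\hat{x}_i=1\}|$. A minimal Python sketch of building those coefficients (the function names are hypothetical, and the enumeration check is only sensible for tiny examples):

```python
from itertools import product


def no_good_cut(x_hat):
    """Return (coeffs, rhs) so that sum(coeffs[i] * x[i]) >= rhs excludes x_hat."""
    coeffs = [1 if xi == 0 else -1 for xi in x_hat]
    rhs = 1 - sum(x_hat)
    return coeffs, rhs


def satisfies(x, coeffs, rhs):
    """Check whether a binary vector x satisfies the cut."""
    return sum(c * xi for c, xi in zip(coeffs, x)) >= rhs


coeffs, rhs = no_good_cut([1, 0, 1])
# Enumerate all binary vectors and collect those the cut removes.
cut_off = [x for x in product((0, 1), repeat=3) if not satisfies(x, coeffs, rhs)]
```

For $\hat{x}=(1,0,1)$ the only vector in `cut_off` is $\hat{x}$ itself, which is the sense in which the cut is shallow.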

If a "no good" cut eliminates just one solution, is it worth the bother? After all, the node that produced $x=\hat{x}$ will be pruned once we realize the subproblem is infeasible. The answer depends on a combination of factors (and essentially reduces to "try it and see"). First, if $x=\hat{x}$ was produced by a heuristic, rather than as the integer-feasible solution to the node LP problem, then you likely cannot prune the current node (and, in fact, the node you would want to prune may be elsewhere in the tree). Adding the "no good" cut may prevent your ever visiting that node, and at minimum will result in the node being pruned as soon as you visit it, without having to solve the subproblem there. Second, if your master problem suffers from symmetry, the same solution $x=\hat{x}$ may lurk in more than one part of the search tree. The "no good" cut prevents your tripping over it multiple times.

It may be possible to strengthen the cut a bit. Suppose that $\hat{x}$ renders the subproblem infeasible (as an IP). There are various ways to identify a subset of the subproblem (excluding the objective function) that causes infeasibility. CPLEX can do this with its conflict refiner; other solvers may have similar functionality. Let $N$ be the set of indices of the $x$ variables and $N_0$ the set of indices of all $x$ variables that appear in the right hand side of at least one subproblem constraint identified as part of the conflict. If we are lucky, $N_0$ is a proper subset of $N$. We can form a "no good" cut for the master problem using just the variables $x_i, i\in N_0$, rather than all the $x$ variables, and obtain a somewhat deeper cut (one that potentially cuts off multiple master problem solutions). The caveat here is that running something like the CPLEX conflict refiner, after determining that the subproblem is infeasible, may eat up a fair bit of CPU time for questionable reward.
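
The restricted cut can be sketched the same way. The code below assumes the index set $N_0$ has already been obtained from some conflict analysis (that step is solver-specific and not shown), uses 0-based indices, and uses hypothetical function names:

```python
from itertools import product


def restricted_no_good_cut(x_hat, n0):
    """No-good cut over the indices in n0 only; other coefficients are zero."""
    coeffs = [0] * len(x_hat)
    for i in n0:
        coeffs[i] = 1 if x_hat[i] == 0 else -1
    rhs = 1 - sum(x_hat[i] for i in n0)
    return coeffs, rhs


def count_cut_off(coeffs, rhs):
    """Count binary vectors violating the cut (brute force, tiny n only)."""
    n = len(coeffs)
    return sum(
        1
        for x in product((0, 1), repeat=n)
        if sum(c * xi for c, xi in zip(coeffs, x)) < rhs
    )


x_hat = [1, 0, 1, 0]
coeffs, rhs = restricted_no_good_cut(x_hat, n0=[0, 1])
removed = count_cut_off(coeffs, rhs)
```

With four variables and $|N_0|=2$, the cut removes every vector that agrees with $\hat{x}$ on $N_0$, that is $2^{4-2}=4$ vectors rather than one.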

"No good" constraints for optimality

It may be possible to exploit the technique I just described to create ersatz optimality constraints as well. Suppose that the current incumbent solution is $(\tilde{x}, \tilde{y})$, and that some node gives an integer-feasible solution $(\hat{x},\hat{z})$ for the master problem. It must be the case that\[c_1'\hat{x}+\hat{z}<c_1'\tilde{x}+c_2'\tilde{y},\]else the master problem node would be pruned based on bound. Now suppose we pass $\hat{x}$ to the IP subproblem and obtain an optimal solution $y=\hat{y}$. If $c_1'\hat{x}+c_2'\hat{y}<c_1'\tilde{x}+c_2'\tilde{y}$, we have a new incumbent solution. If not, then $x=\hat{x}$ cannot lead to an improved solution, and we can add a "no good" cut to eliminate it (again recognizing that this is a weak constraint).
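
The logic can be illustrated with a toy enumerative loop. This is only a sketch under several assumptions: the master problem is solved by brute force over binary $x$ instead of inside a solver callback, `subproblem` is a hypothetical function returning the optimal IP subproblem value (or `None` if infeasible), and every evaluated $\hat{x}$ is excluded afterwards, including those that became the incumbent, because the loop has already recorded them. The bound test relies on $z\ge 0$ as in the master problem above.

```python
from itertools import product


def solve_with_no_good_cuts(c1, subproblem):
    """Enumerate binary x in order of master cost, using only no-good cuts."""
    n = len(c1)
    excluded = set()                      # each entry plays the role of one no-good cut
    incumbent, best_value = None, float("inf")

    def cost(x):
        return sum(c * xi for c, xi in zip(c1, x))

    while True:
        candidates = [x for x in product((0, 1), repeat=n) if x not in excluded]
        if not candidates:
            break
        x_hat = min(candidates, key=cost)     # "master" solution with z = 0
        if cost(x_hat) >= best_value:
            break                             # pruned by bound, since z >= 0
        value = subproblem(x_hat)             # None means IP-infeasible
        if value is not None and cost(x_hat) + value < best_value:
            incumbent, best_value = x_hat, cost(x_hat) + value
        excluded.add(x_hat)                   # no-good cut for x_hat
    return incumbent, best_value


# Made-up subproblem values; (0, 0) is infeasible.
values = {(0, 0): None, (0, 1): 5, (1, 0): 1, (1, 1): 0}
best_x, best_obj = solve_with_no_good_cuts([3, 2], values.get)
```

On this data the loop rejects $(0,0)$ as infeasible, accepts $(0,1)$ and then $(1,0)$ as incumbents, and stops when the cheapest remaining master solution can no longer beat the incumbent.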

More?

That pretty much exhausts my quiver. If any readers have other ideas for generating Benders cuts from integer subproblems, I invite you to post them in comments.

38 comments:

Shiva, August 1, 2013 at 12:38 AM
I recall using something akin to the "inference duals" in logic-based Benders decomposition by J. N. Hooker.
kalmar, August 1, 2013 at 2:28 AM
For the case of binary only first-stage variables, you could use Laporte-Louveaux Cuts. All you need is a finite lower bound for the problem.
http://www.sciencedirect.com/science/article/pii/016763779390002X
Mike, August 1, 2013 at 5:26 AM
The idea of "logical" or "combinatorial" Benders is pretty popular right now (for a suitable definition of popular). John Hooker started it off, and others (including me) have built on it. I have a talk on this. Generally this work exploits the structure of the subproblems. Codato and Fischetti explored the general integer program case. Lots more to do in this area!
DOFP, August 9, 2013 at 2:15 PM
The real feat of no-good cuts is being able to handle side constraints in Benders decompositions.

Once you get to a point where you cannot generate any more Benders cuts from your subproblem, there are only two possible cases:
1. The Benders cuts as a whole are necessary and sufficient, in which case you are done: having no more Benders cuts to add is equivalent to master feasibility.
2. The Benders cuts as a whole are necessary but not sufficient: you don't know how to generate more cuts, but you are not done yet.
[3. The Benders cuts are not necessary : you did it all wrong, you are not even solving the right problem...]

Finding Benders cuts that are necessary is "easy". You just need to start from the original problem and sum and replace variables / constraints till you get to the cut. If all the steps are correct, then your cut is a valid Benders cut.

Finding a group of Benders cuts that as a whole is a sufficient condition is a serious problem:
- lots of maths: remember your classics, like the network design problem with bi-partition cuts and then the generalized flow cuts computed in a Euclidean graph derived from the shortest paths in the network, or the max-cut where you have to find odd-cycle constraints on a graph where some specific minors have been contracted...
- it doesn't handle side constraints: even if you did do all the maths, you add a constraint to the original problem and the sufficient condition is broken!

That's where no-good cuts enter the scene. They save you from case 2 and allow you to do a Benders decomposition with only necessary cuts while still retaining global optimality. The reason is that, by themselves, no-good cuts are sufficient as a whole, because they are basically equivalent to "generate and test" or ordered enumeration. Consequently, whatever my Benders cuts are, Benders cuts plus no-good cuts are necessary and sufficient.
DOFP, August 10, 2013 at 12:02 PM
Right... but that wasn't really my point, I kind of got sidetracked. My point was in a real setting you need to reinject the solutions found via a heuristic callback.

Let's take your transportation example of Benders decomposition (in one of your posts) and imagine that in the first iteration, two heuristics find the following solutions:
- y = [1, 1, ..., 1] and z = 0
- y = [1, 1, ..., 1] and z = 100

Then you compute a min-cost flow in the subproblem and find that the minimum cost is 10. At this point you have in your hands a full solution of the problem, ([1, 1, ..., 1], 10).

If we follow the algorithm description in your post, in the first case the solution ([1, 1, ..., 1], 100) will be kept, while the other solution will generate an objective cut. Hopefully, in a subsequent iteration, the engine will find ([1, 1, ..., 1], 10) thanks to the objective cut, but there is no guarantee, because it might have "jumped" to another, more promising branch. Worse, if your subproblem is an IP, there will be a duality gap between the primal and the dual, so we are not even sure the optimality cut will "touch" the solution.

At the end, you literally had in your hands a solution for the full problem and didn't give it to the engine. Instead you generated an upper and a lower bound !

In a real-world Benders decomposition you will need a heuristic callback to "inject your corrected solutions". That will help the MIP engine improve its branching and pruning, and save a couple of iterations. Also, your user or customer will appreciate the early solutions, as the "dual ascending" property of traditional Benders is terribly annoying (you don't have any solution for hours, and suddenly you are given the optimal one).
DOFP, August 11, 2013 at 5:38 AM
Hmm... let's flatten those master optimality and feasibility constraints:

min z + sum c_j y_j
subject to
z + y1 + y4 + y5 >= 2 // optimality
z + y3 + y5 + y6 >= 2 // optimality
y1 + y5 >= 1 // feasibility
y4 + y6 >= 3 // feasibility
y_j in {0,1}, z in R+

That's a cover problem (actually a general mixed integer multi-knapsack, which is why there is so much literature about separating MIKPs). There is no reason any heuristic should always find the minimum value for Z. What if the heuristic started fixing Z and "propagated" to the other variables ?

But by far the worst part is the lower bound (when Z in the master is lower than the real Z).
You started with a master solution (Y, 0), found out that the real solution was (Y, Z) and added a cut. However, there is no guarantee in the IP case that the cut will actually even cut anything between (Y, 0) and (Y, Z). So in the next iterations you could find (Y, 1), (Y, 2), ... and have a "slow convergence" behavior, similar to no-good cuts.

That's actually the meaning of "Benders cuts that are necessary but not sufficient". In the IP case, there is a duality gap between the primal (network flow with integer capacity) and the dual used here (max-cut). Consequently, your dual cuts don't always "touch" your primal solutions, even the ones used to generate them. Therefore they may fail to make the master feasible, which was exactly case 2 in the initial discussion about no-good cuts.

That's why, if you don't inject the solutions in an IP Benders, you exhibit dual-ascending behavior, just like traditional Benders.
Verittaas, October 13, 2013 at 4:30 AM
Hi Paul,

We have a network flow problem with the integrality property, in which the optimal objective value of the linear relaxation (with fractional solutions) is the same as that of the integer problem. In such cases, do you think we could have integer subproblems and still hope to exploit LP duality?

Regards,
Vivek
Roberto Moreira, November 6, 2014 at 10:53 AM
Hi Paul,

While googling for a solution to my problem with Benders decomposition, I stumbled on your blog, which relates to what I am facing.
I have a two-stage stochastic Benders decomposition with big-M (disjunctive) constraints in my subproblem. Neither of my objectives (master or subproblem) has binary/integer constraints.
I implemented the Benders optimality cuts the usual way, but in some situations my lower bound overshoots my upper bound and the problem never converges.

Have you ever faced something similar?

On another note, congratulations for the great blog.

Best,
Roberto
Unknown, April 11, 2016 at 11:18 AM
Hi Paul,
Thank you very much for your great blog. It really helps a lot for my research work.
When I googled Benders decomposition with binary variables in the subproblem, I found this useful post.
For my problem, the binary variables introduced in the subproblem are actually dummy variables that are not included in the objective function (they are not the complicating variables either). The physical meaning of the dummy variables is the power flow direction on the transmission lines.

I add a penalty term to the subproblem so that the subproblem is always feasible. What I did to extract duals from the subproblem is quite straightforward:

1. Solve the original MILP model. This will give you the optimal values for the dummy variable (flow direction).
2. Fix the dummy variables at the values obtained from step 1, so that the problem becomes a pure linear program. Then I extract the corresponding duals and form the corresponding optimality cut.

I know that the duality gap in the MILP may prevent the implementation of Benders decomposition, but this method works for my problem. It takes a reasonable time for the solution to converge to a tolerance of less than 0.5% on a large-scale problem.

Could you give some comments on the approach I used? Are the Benders cuts obtained from this approach valid? That is, could the Benders cuts from this approach lead to a suboptimal solution or cut off part of the feasible region of the original problem?
Thank you very much for your time.
Unknown, April 20, 2016 at 2:19 PM
Thank you very much for your time! I guess I understand what you are trying to explain.

I think I have some simulation results to justify my approach (it will definitely be suboptimal, but the results are very close to the global optimum in terms of the objective value).

What I did was compare the results from my approach with the results obtained when relaxing all the flow direction variables in the subproblem to continuous variables. I think the result from the master problem plus the relaxed subproblem provides a lower bound for the original problem (the result itself is not feasible, because the flow direction can take fractional values).

The result obtained by using my approach is very close to the lower bound. So I guess it is useful from engineering point of view.

Thank you again for your help!

Joris, November 7, 2016 at 12:50 AM
"General integer variables can always be converted to binary variables, although it's not clear that the conversion is in general desirable."
Hi Paul, do you remember what you meant by this? That is, how do you convert a general integer variable to binary variables? This seems non-trivial. For example, expressing a Benders cut over binary variables is easy, but doing the same for general integer variables seems much harder.
