If I recall correctly, the original paper by Jack Benders [1] dealt with continuous nonlinear optimization, not discrete linear optimization, and I do not know who first made the connection to MILP models. The application I have in mind here deals with problems that, without loss of generality, can be modeled as follows:
\[ \begin{array}{lrclcc} \textrm{minimize} & c_{1}'x & + & c_{2}'y\\ \textrm{subject to} & Ax & & & \ge & a\\ & & & By & \ge & b\\ & Dx & + & Ey & \ge & d\\ & x & \in & \mathbb{Z}^{m}\\ & y & \in & \mathbb{R}^{n} \end{array} \]
Benders decomposition splits this into a master problem and a subproblem. The subproblem must be a linear program (LP), so all the discrete variables have to go into the master problem (a MILP). You can optionally keep some of the continuous variables in the master, but I will not do so here. You can also optionally create multiple subproblems (with disjoint subsets of $y$), but again I will not do so here. Finally, in the spirit of keeping things simple (and because unbounded models are usually a sign of a lazy modeler), I'll assume that the original problem is bounded (if feasible), which implies that both the master and subproblem will be bounded (if feasible).
The master problem is as follows:
\[ \begin{array}{lrclccc} \textrm{minimize} & c_{1}'x & + & z\\ \textrm{subject to} & Ax & & & \ge & a\\ & g'x & + & z & \ge & f & \forall (g,f)\in\mathcal{O}\\ & g'x & & & \ge & f & \forall (g,f)\in\mathcal{F}\\ & x & \in & \mathbb{Z}^{m}\\ & z & \in & \mathbb{R} \end{array}. \]Variable $z$ acts as a surrogate for the objective contribution $c_{2}'y$ of the continuous variables. Set $\mathcal{O}$ contains what are sometimes known as “optimality cuts”: cuts that correct underestimation of $c_{2}'y$ by $z$. $\mathcal{F}$ contains what are sometimes known as “feasibility cuts”: cuts that eliminate solutions $x$ for which the subproblem is infeasible. Initially $\mathcal{O}=\emptyset=\mathcal{F}$. The subproblem, for a given $x$ feasible in the master problem, is: \[ \begin{array}{lrcl} \textrm{minimize} & c_{2}'y\\ \textrm{subject to} & By & \ge & b\\ & Ey & \ge & d-Dx\\ & y & \in & \mathbb{R}^{n} \end{array} \]Optimality cuts are generated using the dual solution to a feasible subproblem; feasibility cuts are generated using an extreme ray of the dual of an infeasible subproblem. The precise details are unnecessary for this post.
Back in the '60s ('70s, '80s), solvers were essentially closed boxes: you inserted a problem, set some parameters, started them and went off to read the collected works of Jane Austen. Intervening in the algorithms was not an option. Thus the original flowchart for Benders decomposition looked like the following.
Obviously the modern approach can only be implemented if you are using a solver that supports callbacks, and requires more advanced programming than the classical approach. Other than that, what are the advantages and disadvantages?
The main advantage of the callback approach is that it is likely to avoid considerable rework. In the original approach, each time you add a cut to the master problem, you have to solve it anew. Although the new cuts may change the structure of the solution tree (by changing the solver's branching decisions), you are probably going to spend time revisiting candidate solutions that you had already eliminated earlier. Moreover, you may actually encounter the optimal solution to the original problem and then discard it, because a superoptimal solution that is either infeasible in the original problem or has an artificially superior value of $z$ causes the true optimum to appear suboptimal. With the callback approach, you use a single search tree, never revisit a node, and never overlook a truly superior solution.
The potential disadvantage of the callback approach is that you potentially generate a cut each time a new incumbent is found, and some of those cuts quite possibly would have been avoided if you followed the classical approach (and added a cut only when an “optimal” solution was found). Modern solvers are good at carrying “lazy” constraints in a pool, rather than adding the cuts to the active constraint set, but there is still a chance that the modern approach will take longer than the classical approach. I believe (but have no data to support my belief) that the modern approach will typically prove the faster of the two.
[1] Benders, J. F., Partitioning Procedures for Solving Mixed-Variables Programming Problems, Numerische Mathematik 4 (1962), 238--252.


Paul: not re-building the master search tree every time is definitely the way to go. One of the first papers to advocate this approach was the branch-and-check paper by Erlendur Thorsteinsson. You can download it from this page: http://www.mmedia.is/esth/papers/. It's the one that appeared in the CP 2001 proceedings.
ReplyDelete@Tallys: Thanks for the link (and the affirmation). IPs being IPs (I think "IP" stands for "Intentionally Perverse"), I'm sure that there will be occasional problems where rebuilding the tree ends up faster, but I agree that those are likely to be the exceptions.
ReplyDeleteProbably not the most relevant comment but how do you type in the formulas in blogspot. I mean how do you get them to show so nicely.
ReplyDelete@Erling: I'm writing the formulas in LaTeX and using MathJax to render them. In Blogger's dashboard, I went to Settings > Posts and Comments and pasted the following into the Post Template box:
ReplyDelete<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
<</script>
<script type="text/javascript"
src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
@Erling: I should add that when I write a post, I need to switch into HTML mode initially to position the cursor after the script. Once I've started typing, I can switch back to composition (rich text) mode.
ReplyDeleteDear Professor Rubin,
ReplyDeleteFor the callback diagram, in the callback (incumbent) loop, what do you mean by "Accept x_hat, z_hat"? I think z_hat is not well defined when we use callback, sometimes, we find that z_hat is even lower than c2'y.
The arrow leading into that box comes from a diamond designating the result of solving the subproblem. This particular arrow represents the case where z-hat matches c_2'y.
ReplyDeleteDear Professor Rubin,
ReplyDeleteThank you for your explanation.
What I mean is "Is it really correct to have that arrow?".
In the branching tree, is it possible to have c_2'y_hat < z_hat? From our implementation of Benders decomposition with callbacks, it is possible.
Do you have any stopping criteria for the incumbent loop?
Thank you very much.
@NVA: Yes, it is possible that c_2'y_hat < z_hat; that is the arrow downward from the diamond.
ReplyDeleteThe incumbent loop is not a closed loop; the arrow moving up and left from the circular node goes to the "Solve master" box; it does not connect directly to the "incumbent" arrow oriented up and right. After processing the most recent incumbent, the solver regains control and either accepts it or adds a cut. It is possible that the incumbent callback will be called again at the same master problem node. Obviously, the callbacks stop when the solver decides it is done with the master problem.
Dear Professor Robin,
ReplyDeleteIf the constraint Dx + Ey >= d;
was not there, would it still be possible to apply the benders decomposition to it?
If Yes, then how would you generate the benders cuts?
If the connecting constraint were not there, you would have no need for Benders; the problem would separate into an IP involving only x and an LP involving only y. Solve them independently and glue the solutions together.
ReplyDeleteThank You Professor for your explanation.
ReplyDelete"Moreover, ...because a superoptimal solution that is either infeasible in the original problem or has an artificially superior value of z causes the true optimum to appear suboptimal."
ReplyDeleteProf. Rubin
I don't quite follow the meaning of this sentence. Could you please explain more about it? Thanks a lot!
@C.L.: I'll try. Suppose that we are doing Benders the old fashioned way (solve the master to optimality, add a cut, repeat), and suppose that the optimal solution to the current master problem is destined not the optimal solution to the original problem. Then the solution to the current master must be superoptimal (too good to be true) in the original problem (because the optimal solution to the original problem is feasible in the current master). So the solution we are destined to get when we finish solving the current master will prove either to be infeasible in the original problem (and we will add a feasibility cut to cut it off) or we will discover that the value of $z$ in that solution underestimates (if we are minimizing) the actual contribution of the terms for which it is surrogate ($c_2'y$).
ReplyDeleteAs we are solving the current master, it is possible that at some node we will actually encounter the solution that is optimal in the original problem. We know it is feasible in this master, so it lurks somewhere in the search tree -- in a node that is destined to be pruned, because we are destined to find this superoptimal solution that we will subsequently reject. My point was that if we can reject the superoptimal solution as soon as we encounter it (using callbacks), then it will not have a chance to prune the node with the real optimal solution, and we will encounter the real optimum sooner.
Prof.Rubin
DeleteYour elaborate reply resolve my question quite clear, thank you very much!
So, do you agree that the use of callbacks in implementing Benders' decomposition is accepted as standard by most of researchers in OR field?
C.L
I'm hesitant to say "most", since I have not surveyed people in the field. I think it's safe to say that using callbacks is common practice among people working actively with Benders decomposition.
Delete