The original paper by Jack Benders [1] dealt with both continuous and discrete optimization, linear and nonlinear. The application I have in mind here deals with problems that, without loss of generality, can be modeled as follows:

\[ \begin{array}{lrclcc} \textrm{minimize} & c_{1}'x & + & c_{2}'y\\ \textrm{subject to} & Ax & & & \ge & a\\ & & & By & \ge & b\\ & Dx & + & Ey & \ge & d\\ & x & \in & \mathbb{Z}^{m}\\ & y & \in & \mathbb{R}^{n} \end{array} \]

Benders decomposition splits this into a master problem and a subproblem. The subproblem must be a linear program (LP), so all the discrete variables have to go into the master problem (a MILP). You can optionally keep some of the continuous variables in the master, but I will not do so here. You can also optionally create multiple subproblems (with disjoint subsets of $y$), but again I will not do so here. Finally, in the spirit of keeping things simple (and because unbounded models are usually a sign of a lazy modeler), I'll assume that the original problem is bounded (if feasible), which implies that both the master and subproblem will be bounded (if feasible).

The master problem is as follows:

\[ \begin{array}{lrclccc} \textrm{minimize} & c_{1}'x & + & z\\ \textrm{subject to} & Ax & & & \ge & a\\ & g'x & + & z & \ge & f & \forall (g,f)\in\mathcal{O}\\ & g'x & & & \ge & f & \forall (g,f)\in\mathcal{F}\\ & x & \in & \mathbb{Z}^{m}\\ & z & \in & \mathbb{R} \end{array}. \]Variable $z$ acts as a surrogate for the objective contribution $c_{2}'y$ of the continuous variables. Set $\mathcal{O}$ contains what are sometimes known as “optimality cuts”: cuts that correct underestimation of $c_{2}'y$ by $z$. $\mathcal{F}$ contains what are sometimes known as “feasibility cuts”: cuts that eliminate solutions $x$ for which the subproblem is infeasible. Initially $\mathcal{O}=\emptyset=\mathcal{F}$. The subproblem, for a given $x$ feasible in the master problem, is: \[ \begin{array}{lrcl} \textrm{minimize} & c_{2}'y\\ \textrm{subject to} & By & \ge & b\\ & Ey & \ge & d-Dx\\ & y & \in & \mathbb{R}^{n} \end{array} \]Optimality cuts are generated using the dual solution to a feasible subproblem; feasibility cuts are generated using an extreme ray of the dual of an infeasible subproblem. The precise details are unnecessary for this post.

Back in the '60s ('70s, '80s), solvers were essentially closed boxes: you inserted a problem, set some parameters, started them and went off to read the collected works of Jane Austen. Intervening in the algorithms was not an option. Thus the original flowchart for Benders decomposition looked like the following.

Obviously the modern approach can only be implemented if you are using a solver that supports callbacks, and requires more advanced programming than the classical approach. Other than that, what are the advantages and disadvantages?

The main advantage of the callback approach is that it is likely to avoid considerable rework. In the original approach, each time you add a cut to the master problem, you have to solve it anew. Although the new cuts may change the structure of the solution tree (by changing the solver's branching decisions), you are probably going to spend time revisiting candidate solutions that you had already eliminated earlier. Moreover, you may actually encounter the optimal solution to the original problem and then discard it, because a superoptimal solution that is either infeasible in the original problem or has an artificially superior value of $z$ causes the true optimum to appear suboptimal. With the callback approach, you use a single search tree, never revisit a node, and never overlook a truly superior solution.

The potential disadvantage of the callback approach is that you potentially generate a cut each time a new incumbent is found, and some of those cuts quite possibly would have been avoided if you followed the classical approach (and added a cut only when an “optimal” solution was found). Modern solvers are good at carrying “lazy” constraints in a pool, rather than adding the cuts to the active constraint set, but there is still a chance that the modern approach will take longer than the classical approach. I believe (but have no data to support my belief) that the modern approach will typically prove the faster of the two.

[1] Benders, J. F.,

*Partitioning Procedures for Solving Mixed-Variables Programming Problems*,

**Numerische Mathematik**4 (1962), 238--252.

Paul: not re-building the master search tree every time is definitely the way to go. One of the first papers to advocate this approach was the branch-and-check paper by Erlendur Thorsteinsson. You can download it from this page: http://www.mmedia.is/esth/papers/. It's the one that appeared in the CP 2001 proceedings.

ReplyDelete@Tallys: Thanks for the link (and the affirmation). IPs being IPs (I think "IP" stands for "Intentionally Perverse"), I'm sure that there will be occasional problems where rebuilding the tree ends up faster, but I agree that those are likely to be the exceptions.

ReplyDeleteProbably not the most relevant comment but how do you type in the formulas in blogspot. I mean how do you get them to show so nicely.

ReplyDelete@Erling: I'm writing the formulas in LaTeX and using MathJax to render them. In Blogger's dashboard, I went to Settings > Posts and Comments and pasted the following into the Post Template box:

ReplyDelete<script type="text/x-mathjax-config">

MathJax.Hub.Config({

tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}

});

<</script>

<script type="text/javascript"

src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">

</script>

@Erling: I should add that when I write a post, I need to switch into HTML mode initially to position the cursor after the script. Once I've started typing, I can switch back to composition (rich text) mode.

ReplyDeleteDear Professor Rubin,

ReplyDeleteFor the callback diagram, in the callback (incumbent) loop, what do you mean by "Accept x_hat, z_hat"? I think z_hat is not well defined when we use callback, sometimes, we find that z_hat is even lower than c2'y.

The arrow leading into that box comes from a diamond designating the result of solving the subproblem. This particular arrow represents the case where z-hat matches c_2'y.

ReplyDeleteDear Professor Rubin,

ReplyDeleteThank you for your explanation.

What I mean is "Is it really correct to have that arrow?".

In the branching tree, is it possible to have c_2'y_hat < z_hat? From our implementation of Benders decomposition with callbacks, it is possible.

Do you have any stopping criteria for the incumbent loop?

Thank you very much.

@NVA: Yes, it is possible that c_2'y_hat < z_hat; that is the arrow downward from the diamond.

ReplyDeleteThe incumbent loop is not a closed loop; the arrow moving up and left from the circular node goes to the "Solve master" box; it does not connect directly to the "incumbent" arrow oriented up and right. After processing the most recent incumbent, the solver regains control and either accepts it or adds a cut. It is possible that the incumbent callback will be called again at the same master problem node. Obviously, the callbacks stop when the solver decides it is done with the master problem.

Dear Professor Robin,

ReplyDeleteIf the constraint Dx + Ey >= d;

was not there, would it still be possible to apply the benders decomposition to it?

If Yes, then how would you generate the benders cuts?

If the connecting constraint were not there, you would have no need for Benders; the problem would separate into an IP involving only x and an LP involving only y. Solve them independently and glue the solutions together.

ReplyDeleteThank You Professor for your explanation.

ReplyDelete"Moreover, ...because a superoptimal solution that is either infeasible in the original problem or has an artificially superior value of z causes the true optimum to appear suboptimal."

ReplyDeleteProf. Rubin

I don't quite follow the meaning of this sentence. Could you please explain more about it? Thanks a lot!

@C.L.: I'll try. Suppose that we are doing Benders the old fashioned way (solve the master to optimality, add a cut, repeat), and suppose that the optimal solution to the current master problem is destined not the optimal solution to the original problem. Then the solution to the current master must be superoptimal (too good to be true) in the original problem (because the optimal solution to the original problem is feasible in the current master). So the solution we are destined to get when we finish solving the current master will prove either to be infeasible in the original problem (and we will add a feasibility cut to cut it off) or we will discover that the value of $z$ in that solution underestimates (if we are minimizing) the actual contribution of the terms for which it is surrogate ($c_2'y$).

ReplyDeleteAs we are solving the current master, it is possible that at some node we will actually encounter the solution that is optimal in the original problem. We know it is feasible in this master, so it lurks somewhere in the search tree -- in a node that is destined to be pruned, because we are destined to find this superoptimal solution that we will subsequently reject. My point was that if we can reject the superoptimal solution as soon as we encounter it (using callbacks), then it will not have a chance to prune the node with the real optimal solution, and we will encounter the real optimum sooner.

Prof.Rubin

DeleteYour elaborate reply resolve my question quite clear, thank you very much!

So, do you agree that the use of callbacks in implementing Benders' decomposition is accepted as standard by most of researchers in OR field?

C.L

I'm hesitant to say "most", since I have not surveyed people in the field. I think it's safe to say that using callbacks is common practice among people working actively with Benders decomposition.

DeleteHello again, Prof. Rubin

ReplyDeleteI was implementing a Benders’ decomposition algorithm using LazyconstraintCallback and recently I want to make some improvements to accelerate the speed of search of new incumbent of master problem. The master problem contains a three dimension binary variable and a continuous variable. I design a heuristic method to assign an integer value to the integer infeasible solution of node LP relaxation. And I plan to implement it using HeuristicCallback of CPLEX. However, it seems that the newly found incumbent cannot be passed to LazyconstraintCallback to test whether it should be rejected (and meanwhile a new optimality cut being added) or not. I tried two ways to realize the HeuristicCallback:

1)When a new incumbent of master problem is found, I use setBounds() method to modify both of upper bound and lower bound to be same as the incumbent value, then use solve() method to resolve the current node relaxation. It shows that the node relaxation is optimally solved and a better incumbent is produced. But this incumbent is seemed to be ignored and no LazyconstraintCallback routine is invoked.

2)After the above method failed, I use setSolution() method after the resolving node relaxation to inject the incumbent in to the model. Again, no LazyconstraintCallback routine is invoked such that the incumbent is accepted by CPLEX. But actually this incumbent should be rejected.

Have you encountered any situation like this? What do you think the problem is and how to fix it?

I am really eager to have your reply soon.

Thanks a lot.

You definitely need to call setSolution if you want the solution considered.

DeleteI verified that CPLEX apparently does not call a lazy constraint callback (LCC, to save typing) when a solution is injected by the user in a heuristic callback (HC). Frankly, this makes sense to me: CPLEX (reasonably) expects that a user will not inject a solution that should be infeasible.

If your only concern is that the infeasibility be detected and the solution excluded, I suggest that you pull that testing/cut generating logic out of the LCC and make it a separate function, called from both the LCC and the HC. The HC calls it before injecting the new solution, and skips the call to setSolution if the function says the new solution would be infeasible.

If you really want to include the cut that the LCC would have generated, had it seen your heuristic solution, you will need to (a) create a place in global memory to store the cut and (b) add a cut callback (CC) in addition to the other two callbacks. If the HC calls the cut function and gets back a cut (meaning the solution should be excluded, it stores the cut in the global space and skips setSolution. Whenever the CC is entered, it checks the storage area to see if there is a cut queued up. If so, it adds the cut (and clears the storage area). If not, the CC just exits without doing anything.

Prof.Rubin

DeleteThank you very much for your quick reply.

I indeed want to include the cut that LCC would have generated because the cut would possibly generate some better lower bound for a minimization problem . Unfortunately, your suggestion that add another cut callback might not be permitted by CPLEX since it does not allow two CC to be called in one model. CPLEX would simply call the latter CC via use() method and ignore the first CC. Moreover, in CPLEX v12.3, CutCallback cannot be used any more, only LazyConstraintCallback or UserCutCallback is allowed.

Regarding this circumstance, what would you suggest me to do to implement my strategy? Thanks again!

You are correct that CutCallback is no longer available. If I recall correctly, when the first refactored it into LazyConstraintCallback and UserCutCallback, you could still directly subclass CutCallback, but that's no longer true.

DeleteI'm looking further into this, because it may have implications beyond your case, but for right now I don't see any good way to add that cut, at least not globally. You could use a branch callback to create a single child node by adding the cut, which would apply it to all descendants of the node where your heuristic found the rejected solution; but that would not apply it to other branches of the tree.

It belatedly occurred to me that a partial solution would be to check the heuristic solution before injecting it and, assuming it was rejected, queue the cut to be added the next time the LCC is called. The logic of the LCC would be to (a) test the new incumbent and, if it was not feasible, (b) add a new cut to chop it off, then (c) add any queued cuts (and clear the queue). If the LCC adds a cut, it should be called again immediately to give it the chance to add additional cuts. (The repeat calls stop when either the LCC does not add a cut or it adds a cut that duplicates one it already added. I think.) This won't add the queued cut immediately, and it may never add it (if no new incumbents are found), but it might be better than nothing.

DeleteThanks a lot for your answer, I try it in your way and it works now.Though it cannot be guaranteeing that the cuts being added, just like you said, it is better than nothing! Thanks again, Prof. Rubin.

DeleteDear Prof. Rubin,

ReplyDeleteI have been trying Benders' Decomposition to a couple of difficult MIP problems for quite sometime with little success. I have also tried many of the refinements (e.g. McDaniel & Devine (1977); Geofrrion & Graves (1974); Magnanti & Wong (1981)), but haven't had success with any of them: All them perform, in terms of CPU times, much worse than the original MIP formulations. Recently, I read your blog on Modern implementation of Benders' Decomposition, but wasn't quite sure as to how to code it. Then, I found a code (ilobendersatsp.cpp for ATSP), which comes with Cplex 12.3. Before trying to adapt it for my problems, I wanted to see how much of improvement it gives over the classical version of Benders for ATSP. So, I modified ilobendersatsp.cpp to get the classical implementation of Benders, and applied both the versions (Classical and Modern) to several problem instances of ATSP (e.g. br17, ry48p, p43) available at TSPLIB (http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/ATSP.html). In all these instances, the classical implementation beats the modern implementation.

However, the difference in CPU times wasn't much since all these instances could be solved in a few seconds. So, I tried a larger problem (kro124) for which the modern implementation took a little less than 2 hours, while the classical implementation could solve the same in less than 10 minutes.

So, I have the following questions now:

1. Was this expected (i.e., the modern implement to perform worse than the classical implementation)? I must confess that although ilobendersatsp.cpp came with Cplex 12.3, I am running it on Cplex 12.1.

2. The modern implementation of Benders' was my last hope to solve all the open problems I have. But if that itself performs worse than the classical Benders, which in turn performs worse than the original MIP formulation, what else can I try?

Thanks and Regards,

Sachin

Sachin,

DeleteI just glanced at the Java version of the Benders ATSP example. I'm on CPLEX 12.4, so I'm not sure if my version is the same as yours, but I suspect so. It may not be safe to run it on CPLEX 12.1. I would need to examine the code more closely than I have time to do right now, but I'm pretty sure it relies on the fact that, in recent versions of CPLEX, lazy constraint callbacks are called whenever CPLEX finds a new incumbent by any means (including heuristics). That's a relatively recent change to CPLEX, and I'm not positive it was in effect back in version 12.1.

Assuming your results are correct, and not an artifact of the version you are using, would I expect that? No. MIPs being the annoyingly inconsistent buggers that they are, it's certainly possible -- anything is possible -- but I would expect that on most problems the callback approach would be faster. Although I'm not sure it's related to the difference between "classical" and "modern" on this problem, I'll also point out that the ATSP example is not entirely typical, in that the objective function contains no contribution from the LP variables.

Will Benders perform worse than solving the original MIP formulation outright? That depends on a number of factors, including the structure of the original problem, the size of the original problem, and recent sunspot activity. With any decomposition technique, there is a fair bit of overhead, so decomposing an "easy" problem will usually end up taking more time than solving it outright. At what point, if any, the decomposition starts to pay for itself is anybody's guess. (This, by the way, holds true for pretty much any "clever" modeling or algorithmic technique applied to MIPs.)

As to your second question, that is very much problem specific. If you want to send a description of your difficult MIP problems to me (rubin msu edu), I'll try to take a look at them and see if anything pops out. Right at the moment I'm oddly busy, but within a couple of weeks or so I expect to have some free time.

Dear Prof. Rubin,

DeleteThank you so much for your such a prompt response and offer to help. I will soon prepare brief documents on the problems and send them to you by e-mail.

One of them uses a Big M in the original formulation, due to which Bender's Decomposition produces extremely week cuts, and the iterations just run for ever without any improvement (you have talked about this problem in your recent paper in OR: Bai & Rubin (2009)). So, I am trying to use Combinatorial Benders (CB)cuts, as proposed by you in this paper, for this problem. However, CB cuts themselves seem to be very week, and would appreciate some help on this as well. I will send you the description shortly. However, in the meanwhile, could you please quickly answer this: The Master problem in the Benders' Decomposition of this problem is just a feasibility problem (no integer variables in the objective function). I am currently trying to solve it using the classical Benders using CB cuts. However, I am wondering if solving the problem using Modern Benders as opposed to Classical Benders (both using Combinatorial Benders Cuts) will make much difference? Classical Benders has the drawback that solving the Master problem to optimality between cuts has the potential to waste time. But in this problem, we are solving the Master problem only to feasibility (since objective function is 0) between cuts even in Classical Benders.

Coming back to our discussion on ATSP, the code "ilobendersatsp.cpp" allows you to specify whether you want to use (option 0) just lazy constraint callback or (option 1) usercut callback in addition to lazy constraint callback.

I have run the code using both options (0 and 1), and I can see some difference in the node tree the algorithm traverses as a result.

Thanks and Regards,

Sachin

Regarding option 0 and option 1, I've always used the equivalent of option 0 (wait for a new candidate incumbent before adding cuts), but I know from forums that some people use option 1 (aggressively seek cuts at every node, perhaps by rounding the node solution and using that in the subproblem, or perhaps by passing a fractional master solution to the subproblem). My guess is that, like everything else, sometimes option 0 is better and sometimes option 1 is better. My hesitation in using option 1 is that a lot of "drag", both because the user cut callback is called at every node (perhaps more than once) while the lazy constraint callback is called only when an incumbent is found, and because it leads to a lot more time spent solving the subproblem. If option 1 tightens the master quickly enough to prune nodes high in the tree, the drag may be more than compensated; but I have never felt confident enough in that to try it.

DeleteHello Sachin, I would like to talk about this work. I am trying to use Benders for ATSP and I am having dificults. Could you send me a email: michellimaldo@gmail.com

Deletethanks.

Dear Prof. Rubin,

ReplyDeleteCould callback technique be applied to lazy constraint (managed by users, not by CPLEX) to solve MIP problem? My intuition tells me that chances are it can speed up the solving but I'm not sure.

Thanks a lot.

Hoa

Hoa,

DeleteCallbacks certainly can add lazy constraints. Are you talking about a Benders decomposition or just solving a typical MIP model?

I'm not sure what you mean by "managed by users, not by CPLEX". Users can add a lazy constraint but cannot directly remove it (other than by stopping and restarting the solution process). You can use the "cut management" feature to tell CPLEX that it is free to delete the cut later if the cut is not having much effect, but it will still be up to CPLEX to decide when to enforce the cut and whether to purge it.

Thank you very much.

DeleteI'm talking about a typical MIP. What I mean is in the callback, I add a constraint as an normal constraint (i.e., a constraint that is NOT added to the lazy constraint pool of CPLEX). But now I think this might not be possible in CPLEX.

Hoa

Hoa: You cannot modify the model itself while the solver is running. You can add a "user cut" (which is different in some respects from a lazy constraint) in a callback. User cuts are typically used to tighten the LP relaxations. They can be global or local to a particular subtree. The key distinction from a lazy constraint is that user cuts must not cut off integer-feasible solutions.

DeleteI'm not positive, but I"m pretty certain that user cuts also go into a pool and are activated only when they would be violated.

Hello again prof Rubin,

ReplyDeleteIn the equations Dx + Ey >= d, can D contains some big M ? If yes, what do you think of the efficiency of Bender decomposition in this case?

Thank you for your reply.

Hoa

Yes, D can contain "big M" coefficients. In the general case, you will suffer the usual perils of loose bounds and possible numerical instability (about which I've written in other posts). In the special case where the "big M" coefficients multiply binary variables (one per constraint), the intent is usually to impose a constraint on the $y$ variables only when the corresponding $x$ variable is 0. In that case, you are better off literally adding or removing the constraint in the subproblem, rather than generating master cuts containing $M$. Trying searching the phrase "combinatorial Benders cuts"; you should find a reference to a nice paper by Codato and Fischetti on the subject.

DeleteHi professor Rubin,

ReplyDeleteI have used the logic-based Benders decomposition in my recent paper, and your posts on this topic have been very helpful.

Will you give me the permission to reproduce your Benders flow-charts in Beamer (LaTex) presentation? I need them to explain the method.

Thank you very much.

Sure, no problem. The source code for the diagrams is LaTeX using the TiKZ package, so it's very compatible with Beamer. If you write to me at rubin AT msu DOT edu, I'll send you the TiKZ code.

DeleteHello Dr Rubin,

ReplyDeleteI am experiencing a problem related to callback and would be thankful to you if you can clarify.

I am using Cplex 12 with Java API to implement Benders decomposition. I have developed a correctly working implementation of Benders decomposition, and now trying to learn callback.

I am new to this concept and trying to learn from your java code example files (link given in another post of yours).

My master problem (MP) contains binary variables and one continuous variable (as an usual MIP problem), and the subproblem (SP) is a LP flow problem.

MP has some surrogate constraints to ensure feasibility of SP all time, so I did not require any feasibility cut so far.

When I restructured my code in the line of your files BendersExample.java and Benders.java I observed this:

MP solved -> SP solved correctly; then in the 2nd time - SP is giving infeasible solution! On a close look, I found that in the 2nd time values of the binary variables passed from MP have changed (expected) but they violate my surrogate constraints. How can it be possible? Does it mean that for the MP the solver has reached a node in the B&C tree which is infeasible (will be "pruned by infeasibility" eventually?) ? Also, does it mean that in implementation with callback I MUST keep option for adding feasibility cut, although in convensional Benders framework it is not required due to my surrogate constraints?

Thanks in advance for any help.

=======================

Jyotirmoy Dalal

If the surrogate constraints are sufficient to guarantee subproblem optimality, and if the master problem variable values are being passed correctly, there should be no need for feasibility cuts.

DeleteWhen you say that, on the second call to the subproblem, the master variable values violate your surrogate constraints, are you looking at those values in the callback (after they are obtained from the master but before they are passed to the subproblem) or in the subproblem? I'm trying to narrow down whether they are being generated incorrectly, obtained incorrectly, passed incorrectly or applied incorrectly.

Something to try is to record those incorrect binary variables and save any optimality cuts generated somewhere in your code. After the error occurs, add the stored optimality cuts to the original master problem model, and fix the binary variables by setting their lower and upper bounds equal to the recorded values. Then run the conflict refiner to see if CPLEX thinks they violate any constraints. If CPLEX thinks they satisfy all constraints (no conflicts), see if you can confirm violations by "hand" calculations.

Hi Dr Rubin,

ReplyDeleteI am trying to solve a stochastic MILP with Big M constraint, the integer variables(480 integer variables) are binary and it taking cplex a longtime to solve, do you have any reference known to you where I can decompose the problem probably using benders decompose to solve. whats the best way to handle this problem

I know there are papers on decomposing stochastic MILPs, but I don't happen to know the details of any of them. For a general paper on using Benders to eliminate Big M constraints, search for "combinatorial Benders cuts". The paper I have in mind is by Codato and Fischetti (mentioned in a reply above).

Deletethank you

DeleteHi Paul,

ReplyDeletehave u heard of the indicator constraint that can be used instead of big M in cplex. do you have any ideas on how to implement them.

How (and when) to use them is discussed in the CPLEX user's manual, in the section "Discrete optimization > Indicator constraints in optimization". For the most part, I think CPLEX turns them into big M constraints internally; occasionally it may implement them by branching logic, which avoids numerical issues with a really large M but results in an even weaker relaxation.

Deletehow do compute the minimally infeasible subsystem in an LP

ReplyDeleteThat depends on the software. With current versions of CPLEX, I usually use the conflict refiner.

Deletehi paul I have implemented indicator constraint, however I have a problem because x(i,t) which is my binary variable has logical precedence with x(i,t+1) which is also binary and cplex is not recognizing that

ReplyDeletehi Paul,

ReplyDeleteI have implemented combinatorial benders cut. However the solution it gives me is not optimal but feasible. I don't know whether I am missing a step in the Procedure. any thoughts

The most likely explanation would be an incorrectly formulated Benders cut (one that cuts off the optimal solution).

Deletedo you know how the separation problem works in combinatorial benders cut algorithm

DeleteYou pass it a candidate incumbent solution, it solves the subproblem and either (a) blesses the incumbent, (b) returns an optimality cut if the subproblem is feasible but the master problem solution incorrectly estimates the objective contribution of the subproblem or (c) return a feasibility cut if the subproblem is infeasible. Any optimality or feasibility cut should cut off the candidate solution while not cutting off the true optimum.

Deletemy problem is in form cx+dy as objective function, where x is solely binary and y is continuous however d=0. my master problem is purely binary and my slave problem just checks for feasibility. I don't think I need an optimality cut.Codato and fischetti never talked much about proving optimality

Deletemy problem is a two-stage stochastic optimization problem with big-M constraints.

DeleteYou are correct in saying that if your subproblem just checks feasibility, you do not need (and will not get) optimality cuts, only feasibility cuts.

DeleteI have started running benders(CBC) and it is very fast .however before I reach optimality the master problem become very difficult to solve because it goes from a small problem to a large problem after a lot of feasibility cut has been added to the master problem. do you know any ways in which I can get relax the master problem and get very good solutions.By the way I am using matlabs cplexbilp function.

ReplyDeleteTypically, when a large number of cuts pile up, some (many?) of them end up being redundant given subsequent cuts. With other APIs, you can tell CPLEX to allow cuts to expire after a while, but I don't think the MATLAB API supports that (not positive -- I don't use MATLAB). You might want to try removing older cuts from the master problem, while keeping a copy of them on the side. When you need to solve the subproblem, you can check the copies of the older cuts first to see if any are violated (should be faster than solving the subproblem), or you can just solve the subproblem and let it rediscover older cuts if necessary. Another strategy might be to check the slack produced by the latest solution in all cuts, add the new feasibility cut from the subproblem, and make room for it by deleting the old feasibility cut with the most slack.

DeleteI have also checked on the approach discussed earlier in this topic called branch and check. Will the branch and check guarantee optimality

DeleteI have not yet gotten around to reading the Thorsteinsson paper carefully and experimenting with the methods described, but based on a casual reading I would say that a properly executed branch-and-check strategy should guarantee optimality (assuming that the problem has an optimal solution).

DeleteDear Dr. Rubin,

ReplyDeleteThis article helps me to understand about callback function. Currently, I am using Benders Decomposition (BD) algorithm in my research. After I checked this article, I tried to use callback function. However, the performance (solution time) of modern approach is worse than those of classic approach. I tried to test the modern approach with different CPLEX options, but still classic approach is better.

It looks modern approach has more advantages so that it should have better results. How can I explain this results? Thank you for your comment in advance.

In some cases, I think this can happen. Adding cuts to a model alters the search tree, and thus alters the path of the solver. A given cut can cause the solver to switch from a path that would have found a very good or optimal solution quickly to a path that takes longer. Of course, a cut can just as easily do the opposite - switch the solver from a slower path to a quicker path.

DeleteSo it may be that the one or more of the cuts that you are adding in the callback are creating a tree that, by random luck, finds an optimum more slowly than does the tree used in the classical approach. The callback approach will not always be faster, but I think it will _usually_ be faster.

You might want to look at the cuts added when you use callbacks just o make sure they are valid and as tight as you can make them.

Hi Prof Rubin,

ReplyDeleteYou mentioned one can optionally create multiple subproblems but on the condition that the subsets of y should be disjoint (in the third paragraph). For my case, I have several subsystems coupled and there are binary variables involved. So I was thinking about using the framework of Benders' decomposition to first decompose binary variables in the master problem. And for the subproblem, I still want to solve even smaller problems in a decentralized way by decoupling the joint y. Do you think if this idea is applicable in the framework of Benders' decomposition or not by your experience?

Thanks a lot!

Jane Zheng

I've thought about this a little in the past, but mercifully have never had to implement it. If I understand your situation correctly, at least two approaches suggest themselves, both starting with Benders decomposition of the original problem. In one, the subproblem is a MILP which is itself processed by Benders. So you have the master master problem (MM), the master subproblem (MS) and the subproblem subproblems (SS1, SS2, ...). The other approach that comes to mind has a master problem and a MILP subproblem (I'll call them MM and MS again), but MS is solved by Lagrangean relaxation, with the linking constraints relaxed (so that the relaxation of MS decomposes into SS1, SS2, ...).

DeleteBoth approaches should work in theory. In practice, what would worry me is the cost of solving MS to optimality (or near-optimality) each time MM spits up a candidate incumbent. If the candidate incumbent makes MS infeasible, and if you detect that fairly quickly, that will help. If the candidate incumbent makes MS feasible but with an objective value worse than proposed in MM (meaning you need an "optimality cut"), then you have a tactical question: do you solve MS to optimality (producing the strongest optimality cut possible); or do you stop MS as soon as the objective value is worse than proposed by MM, generating a weaker cut (and possibly forcing more nodes in MM to be explored) but generating the cut faster?

Also, I'm not sure what the feasibility and optimality cuts would look like. It's possible that the feasibility cuts in MM would have to be "no good" cuts (excluding the combination of binary values in the incumbent but not excluding any other combinations), which might be a bit weak.

This will all depend on both the specific problem instance and how you implement the algorithm, so I think the only way to decide if either approach will work is to try it.

Hi Prof. Rubin,

DeleteThanks a lot for your reply!

I guess I will try it for my problem and see if/how it works. And one more question about implementation of Benders' decomposition. Since my former algorithms are all implemented in MATLAB, is it possible to implement Benders' decomposition also in MATLAB? I saw in another post you mentioned it is not possible to have control callback in MATLAB at least for modern Benders' decomposition.

Thanks.

Jane Zheng

Jane,

DeleteI'm not a MATLAB user, so I really can't answer that. You didn't specify the solver, but if it's CPLEX, you can ask on the CPLEX forum (see the IBM ILOG Optimization Forums link in the right margin if you're not already active there). I imagine Gurobi and FICO (XPRESS) have similar forums.

Dear Prof. Rubin,

ReplyDeleteI have some problems in really understanding the differenence betweend the modern and classic approach. First of all i wanted to ask you if there is any scientific reference for this modern approach? Second how is it possible to only generate one single mip search tree in the modern approach? i mean after finding an incumbent it generates cuts and solves the master anew like in the classic approach? or how is the new incumbent chosen?

thank you really much!

I don't know of any papers that explicitly compare the single tree approach to the original approach, but it's entirely possible they are out there and I simply have not come across them.

DeleteAs to how the single tree works, the key is that you interrupt the branch and bound process when a candidate solution (potential incumbent) is found, check that solution (in the subproblem), and then come back to the current node in the single tree. If the incumbent was accepted by the subproblem, you continue with branch and bound. If it was rejected, you add the new cut(s) to the current node and to every other node that is still alive in the tree. (It is possible to generate cuts that apply only to the subtree rooted at the current node, but the typical approach generates globally valid cuts). The new cut makes the proposed incumbent infeasible and causes the solver to update the LP solution at that node and then proceed (pruning, branching or possibly finding another candidate incumbent there). Meanwhile, the addition of the cut to the other surviving nodes may cause some of them to be pruned, either because they are infeasible or because their bound has tightened and no longer improves on the current incumbent.

New incumbents are chosen the same way they are in any MIP: either a node (possibly the same one that generated the cut) has an integer-feasible solution to its LP relaxation, or a heuristic used by the solver stumbles on an integer-feasible solution.

Dear Pr. Rubin,

Deletethank you so much for your quick and kind response! it really helped me a lot with understanding the modern approach!

Best regards!

Dear Prof. Rubin,

DeleteI really like your articles, which helped me a lot in understanding how the modern BD with callbacks works. However, I still got a little doubt.

My question is, will the cuts (both feasibility and optimality) used in the "old" BD and in the "modern" way with callbacks be in the same format? Should we somehow alter the cuts in order to use them in the modern approach with single B&C tree? To be specific, in the modern BD with callbacks, if we generate the same-format cuts for only a feasible master solution, is it possible that these cuts can somehow cut out the optimal solution in the single B&C tree?

Thanks a lot for your articles and looking forward to your reply.

Best regards!

Yes, the format/derivation of the cuts is the same, and no, there is no danger that cuts derived from a feasible but suboptimal solution to the current master will cut off the optimal solution. The proof of validity of either type of cut does not depend on the proposed incumbent being optimal (in the current master) or not.

DeleteDear Prof Rubin,

ReplyDeleteI have got immense help from your blog and code example for callback-based Benders decomposition.

I have successfully implemented it in a project before. I use CPLEX12.6 and Java.

At present, I am working with a little differet

location-allocation type model where some of my continuous flow variables are fractions (in order to represent

what fraction of customers get satisfied by a supply node etc.). Therefore, in my model I have a lot of x_ij <= 1 constraints

and obviously there is a term corresponding to it (dual variables' sum over all i,j ) appearing in Benders cut.

When the instance is small, there is no problem but for large instance, I guess - due to the huge number of additional "x_ij <=1"

constraints, I am getting java heap space error while constructing the subproblem itself. Is there any way out?

I also thought to specify the bounds at the variable declaration stage, but then extracting the dual vector does not seem possible.

Any suggestion is welcome. If you want to take a look in my code, I'll be happy to share.

Regards

Jyotirmoy Dalal

Jyotirmoy,

DeleteI think you had the right idea: set the upper bound of x_{ij} to 1 and skip the corresponding constraint. As for the dual value, you can get that by (a) checking whether x_{ij} = 1 in the master solution and (b) if so, querying CPLEX for its reduced cost (which will be the dual value). If x_{ij} < 1, the dual value is zero. I covered that in a long-ago post: http://orinanobworld.blogspot.com/2010/09/dual-values-for-primal-variable-bounds.html.

Dear Prof. Paul,

ReplyDeleteI went through your code and its quite impressive to me. I want to solve a particular problem and want to modify your code but some how I am not able to do. It will be great help to me.

Problem:

Let say I am having 3 scenario for 5 X 5 transportation problem

Scenario-1: 0 , 1, 1, 1, 1

Scenario-2: 1 , 0, 1, 1, 1

Scenario-3: 0 , 0, 1, 1, 1

1 indicates warehouse can supply and 0 indicates warehouse cann't fulfill the demand due to some reason but any of the cases the setup cost will be considered whether the warehouse can supply or not?

the ultimate goal of my problem is find out the minimum expected cost when these 3 scenario will occured?

Will it be possible to fit this problem in Benders decomposition algorithm.

I change the code in sub problem but in below (commented SCENARION CONSIDERED) but not able to change in your BendersCallBack class and in Solve() method too..

ReplyDelete// set up the subproblem

ship = new IloNumVar[nS][nW][nC];

IloLinearNumExpr expr = sub.linearNumExpr();

for (int h = 0; h < nS; h++) { //SCENARIO CONSIDERED

for (int i = 0; i < nW; i++) {

for (int j = 0; j < nC; j++) {

ship[h][i][j] = sub.numVar(0.0, Double.MAX_VALUE, "Flow_" + h + "_" + i+ "_" + j);

expr.addTerm(unitCost[i][j], ship[h][i][j]);

}

}

}

sub.addMinimize(expr, "FlowCost"); // minimize total flow cost

// constrain demand to be satisfied -- record the constraints for use later

cDemand = new IloRange[nC];

for (int h = 0; h < nS; h++) { //SCENARIO CONSIDERED

for (int j = 0; j < nC; j++) {

expr.clear();

for (int i = 0; i < nW; i++) {

expr.addTerm(1.0, ship[h][i][j]);

}

cDemand[j] = sub.addGe(expr, demand[j], "Demand_"+ h +"_" + j);

rhs.put(cDemand[j], master.linearNumExpr(demand[j]));

}

}

// add supply limits (initially all zero, which makes the subproblem infeasible)

// -- record the constraints for use later

// -- also map each constraint to the corresponding binary variable in the

// master (for decoding Farkas certificates)

cSupply = new IloRange[nW]; //NUMBER OF SCENARIO WILL BE MULTIPLIED

for (int h = 0; h < nS; h++) { //SCENARIO CONSIDERED

for (int i = 0; i < nW; i++) {

cSupply[i] = sub.addLe(sub.sum(ship[h][i]), 0.0, "Supply_" + h +"_" + i);

rhs.put(cSupply[i], master.prod(capacity[i] * scenario[h][i], use[i]));

}

}

I'm not sure precisely what you are asking, but if you're asking me to look at your code and then work out what changes you need to make in my callback class and solve method, I'm afraid I must decline due to lack of time.

DeleteDear Dr. Rubin,

ReplyDeleteDo you know when people started using the "modern" implementation for Benders Decomposition in the cases of MIP master problem? Did Cplex have this callback function at their beginning, e.g., 1990s, or 2000s or 2010s? I am just curious the history of implementation evolution. Thank you very much.

Phil

Phil: I'm a relatively recent user of Benders decomposition (although I had come across Benders's original paper quite a while back), so I can't really give you a satisfactory answer. I'm pretty sure that the lazy constraint callback, which makes Benders easier, is a relatively recent addition (somewhere around version 10?). Prior to that, if I recall correctly, you had to combine an incumbent callback (to test and possibly reject new incumbents), a branch callback (to tell CPLEX what to do with the node if the incumbent was rejected) and a cut callback (to add the Benders cuts).

DeleteI think control callbacks are a comparatively recent addition to CPLEX (maybe since 2000), but I can't really recall when I first saw them. The CPLEX folks would know, of course.

Dear Prof. Rubin,

ReplyDeleteLooks like quite a number of years has passed from the time you introduced this modern approach. Therefore, I wonder whether it is possible to summarize when it is superior against the traditional one. It would be nice to have any general recommendations or shared experience. The trade-off between reconstructing the searching tree in the traditional method and running redundant sub-problems in the modern one is clear. Perhaps, the modern approach is to be faster when the master problem remains large enough and sub-problems are tiny. I expect that the traditional approach should be faster when the sub-problem is very costly in running time. For example, when a problem consists of several NP-hard problems and one deals with the so-called logic-based Benders decomposition, like in VRP with loading (bin packing) constraints.

All the best.

Sergey:

DeleteThanks for the comment. I agree that guidelines for choosing between the methods would be helpful, but I for one can't offer any. Experimental comparisons on problems with specific structures would be very helpful, but I'm not aware of any that have been published. (Note to assistant professors out there with idle computers and a need for papers: this is somewhat low-hanging fruit!)

I cannot offer any such comparisons because I've never run the same model both ways. When callbacks became available, I assumed that the "modern" approach would usually be faster, so I started using it (exclusively). It's been decades (sigh) since I last used the traditional approach.

There are actual four variants to consider, a 2x2 design in which "traditional" v. "modern" is one factor. The other factor is whether subproblems are solved only when integer-feasible candidates are found, or at every node. The latter sounds like excessive work to me, but I've seen questions and comments posted on various forums indicating that some people do in fact try to generate Benders cuts at pretty much every node (perhaps by rounding a fractional LP solution).

For now, I'm afraid it remains guesswork as to which approach to take.

Cheers.

Can this blog be cited?

ReplyDeleteYou are certainly free to cite it, although I'm not sure anyone would consider it an authoritative source. There's a post on a blog hosted by the Public Library of Science (PLOS) that gives what I think are reasonable style suggestions: http://blogs.plos.org/mfenner/2011/07/21/how-to-formally-cite-a-blog-post/.

DeleteThanks..Dr. Rubin..I am impressed by your energy.

DeleteThank you! Positive feedback contributes to said energy.

DeleteDear Dr Rubin,

ReplyDeleteYour post is very enlightening as always. One question, in my model, the part of the objective that has to do with y variables takes the form of a convex function (something like y^{-2} + y^{-1}). How should I modify the procedure to generate the cuts. Do you have any suggestion or reference? I 've read the Geoffrion paper but it is not entirely clear to me.

Thanks a lot.

I'm afraid that I've forgotten most of what I knew about nonlinear convex optimization. I assume you are minimizing in the subproblem. One possibility (not necessarily the best) is to approximate $f(y)=y^{-2} + y^{-1}$ (or whatever your subproblem objective function is) by a convex, piecewise linear function. Since $f$ is convex, that piecewise linear function can be written as the maximum of a bunch of linear functions, which turns the subproblem into a linear program. Since you are only approximating the actual objective, there is no guarantee you will get the true optimum, but chances are good you will get close (and, if necessary, you can try tightening the approximation).

DeleteYes, I am minimizing the objective. Indeed I am using the approximation approach in another model. I think this would fit quite well with what you suggested in this post.

DeleteThank you very much again.

Dear Hoa,

ReplyDeleteGeneralized Benders Decomposition or Outer Approximation is generally used for such problems (ref: Chapter 6, Nonlinear and Mixed-Integer Optimization_ Fundamentals and Applications by Floudas).

Regards,

Sachin

Thank you very much, I will read the reference you suggest. As I said earlier, I mostly read the Geoffrion paper for the Generalized Benders Decomposition but it is not very clear to me.

DeleteBest regards,

Hoa

(disclaimer: I didn't read all comments).

ReplyDeleteProf. Rubin was such a big help when I was doing my PhD, 10 ou 12 years ago. Now I get pointed to this post by a PhD student of mine.

Summary: 1) Being an academic is fun. 2) Now, being a PhD supervisor and a "true" doctor, I really can't understand how he finds time for this. 3) I still own Dr. Rubin and now doctor Ceriola a beer.

( https://groups.google.com/forum/#!search/benders$20decomposition$20alysson$20rubin/sci.op-research/sGhIyksVDdk/G-U1kQZSiv0J )

Alysson, nice to hear from you! Although this reminds me of the first time I discovered that a friend of mine had become a grandfather. Makes me feel old! I'm four years retired, so finding time for this is a bit easier than it was back in '03. Once you're a full professor, with less pressure to publish all the time, you may find some slack in your schedule (particularly if you're good at ducking committee work).

DeleteHi Professor Rubin,

ReplyDeleteI'm applying benders decomposition in Xpress-Ive for a crew scheduling problem. I have lots of consecutive constraints hold binary and integer variables. Due to the structure of constraints;

1) I have always getting feasibility cut

2) I'm using bigm method to getray from dual sub problem. Ive tried using C++ but getray function didn't work for my case.

3) I'm trying to use call backs.

I wonder that with callback do I need to stop solving master problem for every integer feasible solution found and solve dual with the feasible one rather than waiting for optimal solution of master problem.

Thanks for this great blog, it's been so good for knowledge.

Regards

It's been quite a few years since I used Xpress, and I've forgotten what callbacks they support. If they have something equivalent to CPLEX's lazy constraint callback, then using it does not stop the solver when an integer-feasible candidate is found, it just pauses the solver.

DeleteDo you need to do this as opposed to solving to optimality, adding a constraint and repeating (the "traditional" way)? No, it's your choice. I think the traditional method is sometimes more efficient. (I suspect that occurs when the master problem is "easy" to solve and the subproblem is "hard".) My subjective, anecdotal, unsupported by rigorous analysis opinion is that the callback ("one tree") method is _usually_ more efficient, and so I default to using it.

You mentioned a "big M" method (in the subproblem?). If you are struggling to improve performance, you might consider using "combinatorial Benders cuts" in lieu of the big M method. Again, sometimes it's faster, sometimes not, depending on how tight your values for M are.

Lastly, you're very welcome. Thanks for the kind words. :-)

Thanks for your reply Professor. I've managed to implement lazyconstraint callback, to search in one tree and add combinatorial cut when it's needed. It solves much faster than before (without callbacks and comb cuts) for small sized problems but it still has problems to reach optimal value for medium and large scaled problems. with one tree approach it gives memory error since it has spent long times to find a new incumbent after 20th integer solution. I used time limit criteria to prevent this situation and mixed classical benders with one tree before finding the optimal value.in this case, it ended up being stuck around %30 percent gap and keep giving similar values for lower bound in every iteration. Do you have any suggestion or diagnosis for this situation.

DeleteIt's hard to diagnose a slow moving bound. Sometimes it reflects a "loose" formulation, but sometimes it happens and there is not much you can do to tighten the formulation. You might do a little searching to see if someone has developed facet defining cuts for a problem similar to yours, and if so try adding them. There will also likely be some solver parameters you can try adjusting. (Again, I'm out of touch with Xpress, but I know both CPLEX and Gurobi are tunable, so likely Xpress is too.)

DeleteIf the out-of-memory error is due to the tree occupying too much memory, you can always reduce the memory footprint by switching to depth-first search (which may extend the solution time considerably). There may also be a parameter that lets you adjust when backtracking occurs, so that you can shift the search a bit toward depth-first (and away from breadth-first) without going totally depth-first. The less often you backtrack, the less memory you use (maybe).

Dear Prof Rubin,

ReplyDeleteIn a lazy constraint callback based BD implementation for minimization problem, if I want to early-stop cplex using an optimality gap% criterion (setting epgap), and on termination extract the UB and LB values using cplex.getObjValue() and cplex.getBestObjValue() functions, am I not extracting UB and LB of the MP? It may be possible that the overall problem has a different UB (of higher value)? So, won't it be a termination earlier than I intend? If yes, what would be the correct way to early-stop a lazy constraint callback based BD by specifying the epgap parameter? Is there any mechanism to tell cplex to continue in a situation where the MP's bounds are smaller than epgap, but overall problem's bounds are still bigger than epgap? Please advise.

Yes, the upper and lower bounds come from the master problem. If some candidate solution survives the callback (no cut generated), it becomes the new master problem incumbent. As long as at least one incumbent has been found, the best (most recent) one will give you the upper bound, and that bound will be valid -- the objective value of the optimal solution will not be any larger than the objective value of that incumbent. If you have not yet found an incumbent, I believe getObjValue will throw an exception. No, you will not get premature termination.

DeleteHi Prof Rubin,

ReplyDeleteWhat if there exist integer variables in subproblems? Is it possible to obtain an extreme ray if subproblem is unbounded like LP? If not, how to generate a Bender cut in this case? Thank you very much.

Sorry, I somehow did not get notification of this comment being posted. If the subproblem contains integer variables and is unbounded, then its LP relaxation is unbounded. So if you know the subproblem is unbounded, you can relax it and get a cut the usual way. The catch is that the LP relaxation can be unbounded without the original subproblem being unbounded. Unfortunately, I'm not aware of what research (if any) there is about that case (or how you know that the subproblem is truly unbounded).

DeleteDear Prof Rubin

ReplyDeletehi

upper bound in benders decomposition dosen't change or change in long time, do you know why?

There are a lot of possible reasons, and possibly more than one applies. (Also, it's hard to answer a question like this without knowing if you are maximizing or minimizing.)

Deleteit's minimizing. i am working on scheduling problem, forexample upper bound dosen't change for 100 iteretion , what should i do to became faster?

DeleteThe upper bound in a min problem is the objective value of the best known feasible solution. You can look for solver options that emphasize finding feasible solutions (perhaps by applying heuristics at more nodes), or you can try designing and running a heuristic before starting Benders decomposition, to find a good starting solution (which you would pass to the solver).

DeleteI am trying to implement BD with start solution and regularization by simple adding a quadratic term to the objective function. When using however a start solution to initialize my master problem, the algorithm overestimates the final cost. Do you change the algorithm anyhow when using start solutions?

ReplyDeleteNo, start solutions do not change the algorithm (provided that either the starting solution is feasible or that it is tested in the subproblem and the subproblem is given a chance to reject it).

DeleteDear Prof. Rubin,

ReplyDeleteCould you please let me know if Cplex' Automatic Benders adds Benders cuts only at integer nodes of the Branch&Bound tree, or also at fractional nodes?

Is there a way to control the nodes (integer versus fractioanl) where to add Benders cuts?

Regards,

Sachin

CPLEX can detect a potential incumbent solution via heuristics, even at nodes with fractional solutions. In both the the "legacy" and "generic" callback systems, lazy constraint callbacks are definitely triggered in those cases, and I would assume that would be true in automatic Benders as well. So I'm fairly sure Benders cuts can be added at nodes with fractional solutions. The only way I can think of to avoid that would be to turn off all heuristics.

DeleteDear Dr. Rubin,

ReplyDeleteI am thinking about using BD for a MIP model. It has two types of Binary variables, both connected to each other through a constraint. The model also has continuous variables (part of disjunctive constraints, modeled with bigM) which in turn depend on both sets of binary variables. The objective function involves minimizing a function of continuous variables. Is there a way to find out beforehand whether BD will be a good approach to try? The number of integer variables in the model runs into tens of millions. An alternate question could be, are there problems for which BD does not work as well? How can one answer that question without lengthy experiments and just by "looking" at a MIP model?

In general, I do not know any way to look at a problem and decide if Benders is the "right" way to model it. Generally speaking, if there would only be one subproblem and if a single model would not exceed the solver's memory capacity, I would probably try a single model first. If the problem decomposes into several / many subproblems, maybe I would try Benders first.

DeleteYou mentioned using big M constraints, which to me is a special case. If I know valid choices for the M parameters that are "reasonably small", I would be inclined to try a single model. On the other hand, if I find myself using "not quite infinity" values for M in order to be sure they are valid, I would definitely try Benders, since it gets rid of the M values.

Bottom line, though, is that these are just personal tendencies. I don't think there is a way to decide with any high degree of certainty just by "looking" at the model.

Thank you Dr. Rubin for the response. As for the first criterion you mentioned, the model is too large with 7 million binary variables for an instance. The single model does exceed the solver's memory capacity. However, I cannot find an intuitive way to decompose the model.

DeleteHere is the model using the notation from paper by Codato and Fischetti on combinatorial BD:

min d^t*y

s.t.

Ax <= a

Bz <= b

Gz + Hx = g

Cx + Dy >= c

Ez + Fy >= e

x,z = bin, y >= 0

As suggested by Codato et. al., I could separate the part with continuous variables and disjunctive constraints into a subproblem. It would still leave three issues: first, the master problem would still contain a large number of binary variables and it may be difficult to get an integer solution using a solver. Especially constraints related to Bz <= b constitute a VRP with multiple precedence and coordination constraints. Second, The objective function will need to be introduced into the slave problem as a constraint of form d^t * y <= UB - e. This has an issue of finding a sufficiently small value of parameter e which does not cut any feasible master problem solution. I have observed multiple integer feasible solutions with objective very close to UB. Final issue relates to the CB cuts added to the Master problem in each iteration. Cutting one solution at a time can introduce a large number of constraints into the master. The master involves, for instance, choosing 100 "x" variables as 1, out of 6 million. I am afraid the number of integer feasible solutions in the Master could be too large to be cut off by CB cuts.

Regarding the second criterion, the big-M is not quite infinity and can be bounded by a reasonably small value. However, solving the problem as a single model still does not work for large instances.

If the M values are "reasonably small", you do not need to use the combinatorial form of Benders from Codato & Fischetti; you can use the original form of Benders. That should eliminate your concern about putting the original objective function in the subproblem in constraint form. As far as decomposing so as to reduce the number of binary variables in any one problem, I don't see any easy fix. You could try asking on OR Stack Exchange (a link to which is in the right margin near the top of the post).

Delete