Sunday, September 23, 2012

Separable Benders Decomposition

A reader (I have readers??) asked me a question about Benders decomposition of a mixed-integer program (MIP) when the linear program (LP) subproblem is decomposable: how do we glue together a mix of optimality and feasibility cuts from the subproblems? The short answer is that we don't.

Rather than repeat a lot of definitions, I'll just refer you to a previous post for some background (if you need it; chances are you don't, because if you're not familiar with Benders, you've already tuned out). I will repeat the formulation of a generic decomposable MIP, just to establish notation:\[ \begin{array}{lrclcc} \textrm{minimize} & c_{1}'x & + & c_{2}'y\\ \textrm{subject to} & Ax & & & \ge & a\\ & & & By & \ge & b\\ & Dx & + & Ey & \ge & d\\ & x & \in & \mathbb{Z}^{m}\\ & y & \in & \mathbb{R}^{n} \end{array} \]The MIP decomposes into a master MIP (containing \(x\) and a real variable \(z\) that serves as a surrogate for \(c_{2}'y\)) and a subproblem (LP) containing \(y\).

Now suppose that \(B\) and \(E\) are both block-diagonal, with the same block structure and with \(K\) blocks. We can decompose the subproblem into \(K\) smaller LPs. In the master problem, we replace \(z\) with \(z_1+\dots+z_K\), with \(z_k\) the surrogate for the objective contribution \(c_{2k}'y_k \) from the \(k\)-th subproblem.

When we obtain an incumbent solution \((\hat{x},\hat{z}_1,\dots,\hat{z}_K)\) in the master, we pass \(\hat{x}\) to each subproblem and solve. Some subproblems may be infeasible (generating feasibility cuts); others may have optimal solutions with \(c_{2k}'y_k\gt\hat{z}_k \) (in which case we generate an optimality cut); and some may have optimal solutions with \(c_{2k}'y_k=\hat{z}_k \), requiring no action. If any subproblem is infeasible, then \(\hat{x}\) is infeasible in the original problem regardless of what goes on in other subproblems, and the feasibility cut from the subproblem is a valid feasibility cut in the master. Similarly, if \(\hat{z}_k\) underestimates the objective value in subproblem \(k\), then the optimality cut generated there is valid in the master regardless of what happens in other subproblems. So each feasibility/optimality cut generated in a subproblem can be added to the master problem without modification. There is no need to combine cuts (and, in fact, combining cuts might weaken them).

I'll close with two comments. First, although Benders is usually taught as "solve master, solve subproblem, add one cut, repeat", adding multiple cuts during one iteration is certainly legitimate and possibly beneficial (assuming, as in this case, that you discover multiple valid cuts). The cuts need not all be a single type (optimality or feasibility); a mix is perfectly fine. Second, while it is fine to solve every subproblem at every iteration, you may also opt to solve the subproblems in some order (perhaps cyclical?), abort the cycle the first time a subproblem generates a cut, and add just that one cut. This speeds up each cycle but delays finding cuts. I really do not know which way is better, but I suspect that optimality cuts generated from a value of \(\hat{x}\) that is found to be infeasible in another subproblem may not be all that useful, and multiple feasibility cuts lopping off the same \(\hat{x}\) might be a bit redundant. So I would be inclined to abort the cycle the first time I got a feasibility cut, and add just that cut, but add all the optimality cuts I found if no feasibility cut was generated.


  1. You should also mention that this is not only correct if x is an integer-valued variable. This is also interesting for pure stochastic linear programming and also for more general settings, e.g. where the master problem is a MIP or even a nonlinear problem.

    Best regards,

  2. Thomas,

    Thanks for pointing that out. You are of course correct (and Benders originally intended his approach for an NLP/LP combination).

  3. This comment has been removed by the author.

    1. I found that solving the decomposable sub-problems as a whole LP but still dividing the Zs into constituent Zs worked out much better. Having Zs like this helped in faster convergence. In my problem, there wasn't clear decomposable sub-problems. Few variables were common to more than one sub-problem. But this still helped in getting solutions very close to optimality!


    2. If the subproblem was not decomposable, how did you split the objective contribution ($c_2^\prime y$) into components ($z_j$)?

    3. My sub-problem is as follows (using slightly different notations):

      Maximize c2' * y
      Subject To:
      A*y <= B (Itinerary demand constraints)
      C*y <= D (Leg capacity constraints)

      Each leg capacity constraint could be decomposed into a Zj. But there are few ys that are common in more than one leg. So it's not very cleanly decomposable. Following would be a Benders cut for each leg:

      Zj <= Sum{A * Pd * Yj} + Sum{C * Pc * Xj}

      Pd is the dual of the demand constraints
      Yj is the primal solution for the list of y variables present in the leg j
      Pc is the dual of the capacity constraints
      Xj is the list of x variables associated with leg j

      This way of decomposing the Zj helps in solving larger problems very fast. But optimality is not guaranteed always. I get within 1% of the optimal solution most of the time.



Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.