## Tuesday, February 8, 2011

### Bounding Dual Variables

An interesting question came up on sci.op-research today: can we (easily) determine bounds for the variables in the dual of a linear program?  If the goal is tight bounds, I don't see any easy solution; but it turns out that, at least in some problems, there may be finite (reasonable?) bounds to be had cheaply.

Let's start with a primal problem in canonical form (alternative forms are left to the reader as an exercise):$\begin{array}{lrclr} \textrm{minimize} & c'x\\ \textrm{s.t.} & Ax & \ge & b & \textrm{(P)}\\ & x & \ge & 0\end{array}$ with dual$\begin{array}{lrclr} \textrm{maximize} & b'y\\ \textrm{s.t.} & A'y & \le & c & \textrm{(D)}\\ & y & \ge & 0\end{array}.$To get an upper bound on dual variable $y_j$, we could solve a modified form of the dual problem:$\begin{array}{lrclr} \textrm{maximize} & y_j\\ \textrm{s.t.} & A'y & \le & c & \textrm{(D*)}\\ & y & \ge & 0\end{array}.$This looks for the largest value of $y_j$ (which we hope is finite) consistent with the dual constraints. The dual to this problem is:$\begin{array}{lrclr} \textrm{minimize} & c'x\\ \textrm{s.t.} & Ax & \ge & e_j & \textrm{(P*)}\\ & x & \ge & 0\end{array}$where $e_j$ is a vector whose $j$-th component is 1 and whose other components are all 0.

Now suppose we know a primal vector $\tilde{x}\ge 0$ for (P) such that $A\tilde{x}=h\gg 0$, where $h\gg 0$ means that $h$ is strictly positive in every component. Let $x^*=(1/h_j)\tilde{x}$. Then $x^*$ is feasible in (P*), which means $c'x^*$ is an upper bound for the optimal value of (P*), and therefore also an upper bound for the optimal value of (D*). So we can safely assert $y_j\le c'x^*$ in (D).

If $b\gg 0$, then any feasible solution to (P) is a candidate value of $\tilde{x}$, and the closer $\tilde{x}$ is to optimal, the tighter the bound on $y_j$. Also note that a single $\tilde{x}$ provides bounds for all of the dual variables $y_j$. So the trick is to know a vector $x\ge 0$ for which $Ax\gg 0$.
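To make this concrete, here is a small numeric sketch in Python (the matrix, costs and feasible point are all made-up data, not from any particular application):

```python
import numpy as np

# Hypothetical data for (P): minimize c'x s.t. Ax >= b, x >= 0.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([4.0, 6.0])
c = np.array([5.0, 4.0])

# A primal feasible point x_tilde with h = A @ x_tilde strictly positive.
x_tilde = np.array([2.0, 2.0])
h = A @ x_tilde                       # componentwise >= b and >> 0
assert np.all(h >= b) and np.all(h > 0)

# For each j, x* = x_tilde / h_j is feasible in (P*), so
# y_j <= c'x* = (c @ x_tilde) / h_j is a valid upper bound on y_j.
dual_upper_bounds = (c @ x_tilde) / h
# Here c'x_tilde = 18 and h = [6, 8], so y_1 <= 3.0 and y_2 <= 2.25.
```

As noted above, one vector x̃ yields bounds for all the dual variables at once, and a x̃ closer to primal optimal gives tighter bounds.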

1. Yes, you can use any primal feasible solution to bound the dual variables. It's a version of dual presolving techniques.

Bounds on the dual variables are used to reduce the problem size, but I think the OP had other usage in mind. Of course you need a good primal solution and even then the bound may still be very weak.

In regard to standard presolving methods, I am not sure you can postsolve these reductions into an optimal basic solution without the need to reoptimize after postsolve.

2. I wasn't sure what the OP was after. He seemed to be talking about a "big M" dual formulation, which didn't really make sense to me.

3. No, it was very unclear what he meant; I did not get it either. He might also just be talking about sensitivity analysis, though I am not sure.

Anyway, primal-dual relations are always beautiful theory in my opinion :-)

4. In OR practice, we can cheat and do something simple. We don't want the above canonical form in an app (to avoid seeing "infeasible"), so we solve

Min c.x + M.s
A.x + s >= b
x, s >= 0

which gives an equally nice bounded dual problem to work with:

Max b.y
y.A <= c
0 <= y <= M

Pick M to be the smallest practical violation penalty that we can get away with safely, which also gives an intuitive meaning to the tightest dual upper bound. Typically the worst-case value for M is really large, but a tiny fraction of that often works.
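A sketch of this elastic formulation with SciPy's LP solver (the data and the penalty M are hypothetical; `ineqlin.marginals` is SciPy's HiGHS dual output for the inequality constraints):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: minimize c'x s.t. Ax >= b, x >= 0, made elastic
# with penalized slacks s: minimize c'x + M*sum(s) s.t. Ax + s >= b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([4.0, 6.0])
c = np.array([5.0, 4.0])
M = 1000.0   # violation penalty; also caps each dual value

m = A.shape[0]
c_ext = np.concatenate([c, np.full(m, M)])   # objective over (x, s)
A_ub = -np.hstack([A, np.eye(m)])            # -(Ax + s) <= -b
res = linprog(c_ext, A_ub=A_ub, b_ub=-b, method="highs")

# Duals of Ax + s >= b (sign-flipped marginals of the <= form);
# each is bounded by M because of the slack's penalty column.
y = -res.ineqlin.marginals
assert np.all(y <= M + 1e-9) and np.all(y >= -1e-9)
```

With this data the duals come out well below M (around 2.2 and 0.6), illustrating that the penalty only needs to dominate the "true" shadow prices, not be astronomically large.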

5. I am the one who posted the original question. Thank you all for the answers. I have a better understanding now and will think more about my specific applications.

BTW, in "Top Ten Secrets to Success with Optimization" by Dr. Brown (http://faculty.nps.edu/gbrown/browngpu.htm), one rule is to always bound the dual variables. I feel it is sometimes difficult to attach meanings to dual variables. In the shortest path problem and maximum flow problem, the dual variables have natural meanings. But for a complicated formulation, how can one get the meaning?

6. Having skimmed Dr. Brown's paper (thanks for the link), I'm not sure that bounding the dual variables is a goal so much as an outcome of relaxing constraints with penalties. What Dr. Brown is talking about there is similar to Milan Zeleny's "de novo programming" (http://www.milanzeleny.com/documents/literary_works/Optimality.pdf). Is there any virtue to explicitly bounding the dual variables when the model formulation causes them to be bounded implicitly (meaning you don't know the bounds but you do know the bounds are present)? I'm assuming that you are actually solving the primal (whether by primal simplex, dual simplex or other algorithm); if you are explicitly solving the dual, then maybe bounds on dual variables have a computational impact.

Regarding an interpretation of the dual variables, they always represent the marginal change in the objective function per unit change in the RHS of a particular constraint. What that means is of course model-specific. Consider a production planning problem (cost minimization) with an inventory balance constraint, written as "opening inventory + production - shipments - closing inventory = 0". The dual variable is the marginal cost per unit of product consumed by something other than shipment (pilferage, the dog ate it, ...). Write the constraint in the opposite order, and the dual variable is the marginal cost per unit of inventory deposited by the Good Inventory Fairy when nobody was looking.

7. @Shiva: What you're suggesting is precisely what Gerald Brown was talking about. I'd be interested to see whether there is any evidence that this has any significant effect (in either direction) on computational speed for models that are inherently feasible.

8. The modelling trick Shiva suggested is not always a good idea. If you think about it, you are actually trying to fix an otherwise infeasible model.

One could argue it should be split into a two-stage model, one dealing with the feasibility part and one with the optimality part. It is very convenient for the user to patch small model infeasibilities, but I have seen many cases where it's misused and high penalties are used. It can produce a badly scaled model, and numerical issues may occur.

It will never enter the basis, you say? Well, don't be so certain: in presolve, a column singleton may be substituted for a costed slack.

In general I think one should sit down and really think about why the model is infeasible and try to learn from that. Though there are places where such a trick is perfectly OK.

9. @Paul: Thank you for the explanations. Now consider an LP model whose objective's lower and upper bounds, L and U, are known (say, due to the application itself). Dual variables can be interpreted as marginal rates of change. Can I say any dual variable is between [L-U, U-L]? Thank you.

10. @Bo: I agree with what you're saying about infeasibility. If the primal model is infeasible, I want to know why; I'm not looking for a workaround. The "de novo programming" variant of this would attach actual, context-determined penalty costs to violations (with some bounds on the allowed violations). Consider, for example, a production model with a constraint that says total consumption of raw material R must not exceed some limit b. Zeleny's notion is to challenge why b is the limit. Often the answer is "well, we've always been limited to b before". So the inserted variable (a surplus variable in this case) represents a supply of R in excess of b obtained somehow, and M is the actual cost of acquiring the extra material. If the optimal solution sets a strictly positive value for the surplus variable, it's a message to the owners of the model to consider expanding their supply of R. You can do the same by examining dual variable values, but that requires extra effort after the fact that may be overlooked, whereas Zeleny's approach puts in the extra effort up front. (I'm not saying I'm a practitioner of de novo programming, just pointing out that it is one way to challenge assumptions, which I think is generally a good thing.)

11. @JQ: No, I don't think you can make that argument in general. First off, if L and U are bounds for the objective value _given the current RHS_, that says nothing about bounds on the objective after we modify the RHS.

Let's say the model objective is cost minimization and the constraint is demand for product P (which must be met), and that we somehow know that for any feasible RHS the cost must be between L and U. The shadow price lambda is marginal cost per unit demand for P. Do we know anything about the size of lambda (other than its sign)? No. For a positive increase w in demand for P, cost goes up by exactly lambda*w if w is sufficiently small and at least lambda*w if w is larger. So lambda can be arbitrarily large (or for that matter arbitrarily small in magnitude), and all that tells us is that if adding lambda*w to the current optimal cost would exceed U, then it must be infeasible to add w to the current demand for P.

12. Bo said: "... The modelling trick Shiva suggested is not always a good idea. If you think about it, you are actually trying to fix an otherwise infeasible model."

1. 100% true, Bo. I would try the simple idea first and go to a two-phase approach if this doesn't work. Shocking as this may sound, mission-critical large-scale apps run 365 days a year in the airline industry with this simple idea without much problem, but there are rare days when you hear the occasional grumbling :)

2. Often constraints can be violated by small amounts, and often customers don't tell me that unless I ask them.
Sorry to post this link, but I always remember this movie scene during project requirements time!
http://dualnoise.blogspot.com/2010/11/or-practice-tip-find-and-eliminate.html

13. Paul enquired: "... I'd be interested to see whether there is any evidence that this has any significant effect (in either direction) on computational speed for models that are inherently feasible..."

There is empirical evidence I've seen, yes (e.g., large-scale airline crew scheduling): quicker convergence of a column generation procedure using common-sense-driven penalty values, where the duals are the most important drivers.

14. @Shiva, I agree there are situations where it makes perfect sense, but I just wanted to underline that it should be used with care and by people who know what to do and why.

15. @Paul, Regarding JQ's query,

Say I have a cost minimization problem with a few constraints. I'm trying to find an upper bound on 'L', the dual variable of a particular constraint in the primal problem.

Now say my cost minimization objective after solving is COSTMIN. I then change my objective to a maximization and solve again to get COSTMAX.

Is this statement correct: L <= COSTMAX - COSTMIN?

Any help will be appreciated.

16. @earnest: In general, no. Here's a trivial example (something of a pathological case, but enough to make the point): minimize 2x s.t. x = 5. COSTMIN = COSTMAX = 10, so COSTMAX - COSTMIN = 0. The dual value (L) for the lone constraint is 2.
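A quick numerical check of this example (using SciPy's HiGHS-based `linprog`; `eqlin.marginals` holds the duals of the equality constraints):

```python
from scipy.optimize import linprog

# Paul's example: minimize 2x s.t. x = 5 (x >= 0 by default).
res_min = linprog([2.0], A_eq=[[1.0]], b_eq=[5.0], method="highs")
res_max = linprog([-2.0], A_eq=[[1.0]], b_eq=[5.0], method="highs")

costmin, costmax = res_min.fun, -res_max.fun   # both 10.0
L = res_min.eqlin.marginals[0]                 # dual of the lone constraint: 2.0
# L = 2 > COSTMAX - COSTMIN = 0, so the proposed bound fails.
```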

17. two questions:

1) In regard to earnest's question: if you were to remove the constraint in question when finding COSTMIN and COSTMAX, would the statement L <= COSTMAX-COSTMIN then be correct?

2) Let's say 'y' is the dual vector and you want to find the tightest possible lower and upper bounds, L_i and U_i, for each y_i, such that there exists an optimal dual solution in which L_i <= y_i <= U_i. (Clearly the lower and upper bounds discussed earlier in this post would be valid, but I am attempting to find tighter bounds.)

Can you use the economic interpretation of the duals to do this? What I mean is, if, by looking at constraint i, I can say:

"an increase or decrease in the RHS will change the objective function by no more than some amount, X"

can I then use this analysis to determine the L_i and U_i values? Or is it not guaranteed that there always exists an optimal dual solution such that the dual values are between the lower and upper bounds for the shadow prices as determined by the economic interpretation?

Thanks

18. The answer to your first question is 'no'. Counterexample: minimize 2x + 3y s.t. x + y >= 0.5, 0 <= x, y <= 0.3. COSTMIN = 0, COSTMAX = 1.5 but L = 3.
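This counterexample is also easy to check numerically; a sketch with SciPy (the >= constraint is passed to `linprog` in its <= form, so its dual is the negated marginal):

```python
from scipy.optimize import linprog

c = [2.0, 3.0]
bounds = [(0, 0.3), (0, 0.3)]

# COSTMIN and COSTMAX with the constraint x + y >= 0.5 removed:
costmin = linprog(c, bounds=bounds, method="highs").fun              # 0.0
costmax = -linprog([-2.0, -3.0], bounds=bounds, method="highs").fun  # 1.5

# Dual of x + y >= 0.5 in the full problem (stated as -x - y <= -0.5):
res = linprog(c, A_ub=[[-1.0, -1.0]], b_ub=[-0.5], bounds=bounds,
              method="highs")
L = -res.ineqlin.marginals[0]   # 3.0, exceeding COSTMAX - COSTMIN = 1.5
```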

The answer to your second question, contingent on correct phrasing and one assumption, is 'yes'. The one assumption is that, for each constraint in isolation, it is possible to perturb the right hand side in at least one direction and maintain feasibility of the primal. The adjustment to the phrasing is that the dual value (shadow price) is a _rate_, so "by no more than some amount" needs to become "at no faster than some rate". Given that (and assuming that the primal problem has an optimum), _any_ optimal solution to the dual will satisfy your bounds. The catch is, how do you come up with "economic" bounds that are tighter than what we've already discussed?

19. I'm not sure that _any_ dual optimal solution will satisfy the bounds, but it seems to me that at least one does. Consider the example:

minimize x+10y s.t. x+y = 1, 0 <= x <= 0, 0 <= y <= 1

I see that the right-hand shadow price of constraint x <= 0 is -9. I consider that to be my lower bound on the corresponding dual variable because there is no greater rate at which the objective can improve when increasing the RHS of that constraint.

The dual:

max w1 + w3 s.t. w1 + w2 <= 1, w1+w3 <= 10, and w1, w2 <=0

The dual optimal solution is (10, w2, 0), where w2 <= -9. So in this case, -9 cannot be considered to be a LB on w2 in general, but there exists an optimal solution that satisfies this constraint.

So, it seems as though degeneracy plays an interesting role here, which is why I'm still a little skeptical as to whether I can use my "economic" bounds in general.

20. I think you meant w2, w3 <= 0 in the dual, but that doesn't impact the discussion here.

The problem lies in the following statement:
"I consider that to be my lower bound on the corresponding dual variable because there is no greater rate at which the objective can improve when increasing the RHS of that constraint." You have to consider decreases as well as increases. If you decrease the bound by any positive amount, the objective value jumps to infinity (the minimum of any empty set). That maps to a dual "value" of negative infinity (subtract epsilon from the upper bound and think of the rate of change as (infinity - 10)/(-epsilon)). So -9 is not a lower bound for the dual variable.

In other words, it's not enough that bound increases cannot drop the objective faster than nine times the increase; to be a bound on the dual variable, it must also be true that decreases in the bound cannot raise the objective faster than nine times the decrease.

Yes, I did mean w2, w3 <= 0, thanks.

You're right, I needed to consider the LH shadow price as well. I appreciate the discussion.

22. Currently, I am studying an assignment problem with a side constraint. Can we determine lower and upper bounds for the dual variable corresponding to the side constraint?

23. I think so, but not exactly as described in the post. Let's assume the assignment problem is a minimization, and that the side constraint looks like a'x >= b (which means the dual of the side constraint has lower bound zero). We can follow the derivation above, but the modified primal (P*) is infeasible (it's an assignment with zero supply, zero demand but a side constraint a'x >= 1), which makes (D*) unbounded. So we do not get a finite upper bound for the dual variable of the side constraint.

Since (D*) is unbounded, there is no finite upper bound on _feasible_ values of the dual variable, but it might still be possible to rule out values above some threshold in an _optimal_ solution. Assume you know a feasible solution x0 to (P) such that a'x0 >= b + 1. Add a constraint to (D*) saying that the objective value of (D) has to be at least the value of the known primal solution (c'x0). That adds a new variable (y) to (P*). I think that our assumption that a'x0 >= b + 1 means that x = x0, y = 1 is feasible in (P*) -- but you'll want to check that. If so, the objective value of that solution in (P*) supplies your upper bound on the dual variable for the side constraint.

1. Thank you for your explanations, I appreciate it.

2. I don't think that x = x0, y = 1 is feasible in (P*) because the solution gives a'x0 + b >= 1.

24. Sorry, I left out one thing. The new constraint in (D*) goes the "unnatural" direction (it's a >= constraint in a max problem). So either you multiply both sides by -1 and make y >= 0 (which is what I did, but forgot to say), or you put it in as is and make y <= 0 (in which case the solution is x = x0, y = -1).

1. Thank you for your comments. I am considering a min. assignment problem with a side constraint a'x <= b and want to prove w(l-b) <= z_lp, where w, z_lp, and l are, respectively, a dual variable of the side constraint, the optimal value of the linear relaxation of the assignment problem with the side constraint, and the lower sensitivity range for the RHS of the side constraint (i.e., l <= b <= u). To prove this, I tried to use your approach, but I am still struggling with it. Could you give me any good ideas? Thank you.

2. You won't be able to prove it because it is not true. Consider a two source, two sink assignment problem with cost matrix c = ((0, M), (M, 0)) for large M. Add the side constraint x11 <= b where 0.5 < b < 1. The optimal solution to the LP is x11 = x22 = b, x12 = x21 = 1 - b, with cost 2M(1-b). The range for b is [0, 1], so l = 0. The dual value of the side constraint is w = -2M. Your inequality resolves to -2M(0-b) <= 2M(1-b); with a little algebra, that ends up b <= 1/2, contradicted by our choice of b.
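This counterexample can be verified numerically; a sketch with SciPy, choosing M = 100 and b = 0.75 to satisfy 0.5 < b < 1:

```python
import numpy as np
from scipy.optimize import linprog

M, b = 100.0, 0.75                    # any 0.5 < b < 1 works
c = np.array([0.0, M, M, 0.0])        # costs for x11, x12, x21, x22
A_eq = np.array([[1, 1, 0, 0],        # row 1 of the assignment sums to 1
                 [0, 0, 1, 1],        # row 2 sums to 1
                 [1, 0, 1, 0],        # column 1 sums to 1
                 [0, 1, 0, 1]],       # column 2 sums to 1
                dtype=float)
A_ub = np.array([[1.0, 0.0, 0.0, 0.0]])   # side constraint x11 <= b
res = linprog(c, A_ub=A_ub, b_ub=[b], A_eq=A_eq, b_eq=np.ones(4),
              method="highs")

z_lp = res.fun                    # 2M(1-b) = 50.0
w = res.ineqlin.marginals[0]      # -2M = -200.0
lhs = w * (0.0 - b)               # l = 0, so w(l-b) = 150.0 > z_lp
```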

3. Thank you, I appreciate your help.

4. Then is this statement, w(l-b) <= z_ip, correct? Here, z_ip is the optimal value of the assignment problem with the side constraint. Thank you.

5. No. Modify the previous example by setting c11 = c22 = -1 and b = 1. The result is that z_lp = z_ip = -2. The dual has multiple solutions, the extreme cases being w = 0 with l = b and w = -2(M+1) with l = 0; so your choices are 0(1-1) <= -2 or -2(M+1)(0-1) <= -2, neither of which is correct.

25. You are right, but I think a difficulty arises when z_lp < z_ip. What do you think about this case? Thank you.

1. I'm not sure what you mean by "a difficulty". If you mean that your revised inequality might hold when z_lp is strictly less than z_ip, I suppose it might be possible that it holds, although I see no reason why it should. I can find no intuition behind either the original or modified inequality.

2. Thank you for your quick reply. I mean it is difficult for me to prove w(l-b)<=z_ip when z_lp < z_ip. Are there any ideas for this? Your help will be much appreciated.

3. I can't offer any tips, because I do not believe it's true. I do not have a counterexample at hand, but I see no reason to expect it to be true, and the fact that it is false when z_lp = z_ip is not encouraging.

4. I have tried to find a counterexample and tested lots of instances such that z_lp < z_ip, but I couldn't find any instances contradicting w(l-b)<=z_ip.
Thank you. I really appreciate your help.

26. n=5, c_{i,j}=0 if i=j and 100 otherwise, side constraint sum_i x_{i,i} <= b with b=4.9. Then z_ip=200, z_lp=10, w=-100, and l=1, so your inequality becomes -100(1-4.9) <= 200, i.e., 390 <= 200, which is false.
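A sketch verifying z_lp and w for this instance with SciPy (variables taken as the raveled 5x5 matrix x):

```python
import numpy as np
from scipy.optimize import linprog

n, b = 5, 4.9
c = np.where(np.eye(n, dtype=bool), 0.0, 100.0).ravel()  # 0 on diagonal, 100 off

# Row-sum and column-sum constraints of the relaxed assignment problem,
# with variables ordered as the row-major ravel of the 5x5 matrix x.
rows = np.kron(np.eye(n), np.ones(n))   # each row of x sums to 1
cols = np.kron(np.ones(n), np.eye(n))   # each column of x sums to 1
A_eq = np.vstack([rows, cols])
b_eq = np.ones(2 * n)

A_ub = np.eye(n).ravel()[None, :]       # side constraint: sum_i x_{i,i} <= b
res = linprog(c, A_ub=A_ub, b_ub=[b], A_eq=A_eq, b_eq=b_eq, method="highs")

z_lp = res.fun                    # 10.0: off-diagonal mass 0.1 at cost 100
w = res.ineqlin.marginals[0]      # -100.0: each extra unit of b saves 100
```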

1. Thank you for the example.
