Tuesday, May 17, 2011

My Infinity Is Bigger Than Your Infinity

Mathematicians are used to the concept that some infinities (e.g., $\aleph_1$) are bigger than other infinities (e.g., $\aleph_0$). In mathematical programming, though, "infinity" is really just a placeholder for "I know there is a finite limit but I don't know what it is" (or "I know there is a finite limit but I'm too lazy to look for it"). The only place you'll see $\infty$ in a mathematical program is as an omitted bound on a variable.

There are two other truisms, related to this post, that are well known to anyone with a reasonable amount of experience solving mathematical programs via user-written code (including APIs to solver libraries). The first is that solvers run on double-precision floating point arithmetic, and $\infty$ is not a d-p number. So it is reasonable to treat any d-p number larger than some specified limit $M$ as a surrogate for $\infty$. In the CPLEX APIs, $M=1e+20$. (In the Java version of the Concert API, the static constants Double.MAX_VALUE and Double.POSITIVE_INFINITY work well for $\infty$, while -Double.MAX_VALUE and Double.NEGATIVE_INFINITY work for $-\infty$; but you can just as easily use $\pm 1e+20$.) The second truism is that mixing very large coefficients with not-so-large coefficients in constraints is an invitation to numerical instability.
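As a concrete illustration, here is a minimal sketch (class and variable names are mine, and I'm assuming the CPLEX jar is on the classpath) of how those surrogates look when declaring variables through the Java Concert API. Any of the three upper bounds below is treated by CPLEX as "no upper bound."

```java
import ilog.concert.IloException;
import ilog.concert.IloNumVar;
import ilog.cplex.IloCplex;

public class InfinityBounds {
    public static void main(String[] args) throws IloException {
        IloCplex cplex = new IloCplex();
        // All three upper bounds below are at or above CPLEX's infinity threshold (1e+20),
        // so CPLEX treats each variable as having no upper bound.
        IloNumVar x1 = cplex.numVar(0.0, Double.POSITIVE_INFINITY, "x1");
        IloNumVar x2 = cplex.numVar(0.0, Double.MAX_VALUE, "x2");
        IloNumVar x3 = cplex.numVar(0.0, 1e20, "x3");
        cplex.end();
    }
}
```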

This brings me to something I tripped over accidentally a day or so ago. Consider the following trivial linear programming problem: $$\begin{array}{lc}\mathrm{maximize} & x \\ \mathrm{s.t.} & x\le M\end{array}$$ where $M>0$ is a large (but finite) number. The problem has an obvious (and trivial) optimal solution ($x=M$) ... unless you are using a solver, and $M$ is large enough to look like $\infty$. Indeed, for $M\ge 1e+20$, CPLEX (using any of the APIs) will (and should) declare the problem unbounded.
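For anyone who wants to try this, here is a minimal sketch of that toy model in the Java Concert API (my own class name; nothing here is specific to a particular CPLEX version). With $M=1e+20$ CPLEX reports the model unbounded; with a modest $M$ it reports an optimal solution with $x=M$.

```java
import ilog.concert.IloException;
import ilog.concert.IloNumVar;
import ilog.cplex.IloCplex;

public class BigBound {
    public static void main(String[] args) throws IloException {
        double M = 1e20; // try smaller values (e.g., 1e6) to see an optimal solution instead
        IloCplex cplex = new IloCplex();
        IloNumVar x = cplex.numVar(0.0, Double.POSITIVE_INFINITY, "x");
        cplex.addMaximize(x); // maximize x
        cplex.addLe(x, M);    // s.t. x <= M
        if (cplex.solve()) {
            System.out.println("Status: " + cplex.getStatus() + ", x = " + cplex.getValue(x));
        } else {
            System.out.println("Status: " + cplex.getStatus());
        }
        cplex.end();
    }
}
```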

The following table shows the results of a little experiment I did (with CPLEX 12.2.0.2, but the results are not version-dependent):

|  | Presolve | No Presolve |
| --- | --- | --- |
| $M \lt 1e+10$ | Optimal ($x=M$) | Optimal ($x=M$) |
| $M = 1e+10$ | Unbounded | Optimal ($x=M$) |

Note the Presolve entry in the bottom row: for $M = 1e+10$, the CPLEX presolver asserts that the problem is unbounded. An e-mail exchange with Dr. Ed Klotz of IBM sorted this out for me, and the rationale does in fact make sense. Suppose that the presolver encounters a variable $x$ that it can fix at a large bound $M$. As it eliminates the variable from the model, any instance of $x$ in the model (say, $a_i x$ in the $i^{\mathrm{th}}$ constraint) is replaced by a potentially large number ($a_i M$), which may introduce numerical instability (see the second truism above). So CPLEX is somewhat conservative during presolve about how large is too large. In effect, $\infty=1e+20$ in the solver but $\infty=1e+10$ in the presolver.
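To reproduce both columns of the table, the only change between runs is whether the presolver is enabled. Here is a hedged sketch of that toggle; the parameter is spelled IloCplex.BooleanParam.PreInd in the Java API of that vintage (newer versions also expose it under the Preprocessing parameter group), and everything else is my own naming.

```java
import ilog.concert.IloException;
import ilog.concert.IloNumVar;
import ilog.cplex.IloCplex;

public class PresolveThreshold {
    public static void main(String[] args) throws IloException {
        double M = 1e10; // the threshold at which presolve and no-presolve disagree
        for (boolean presolve : new boolean[] { true, false }) {
            IloCplex cplex = new IloCplex();
            cplex.setParam(IloCplex.BooleanParam.PreInd, presolve); // toggle the presolver
            IloNumVar x = cplex.numVar(0.0, Double.POSITIVE_INFINITY, "x");
            cplex.addMaximize(x); // maximize x
            cplex.addLe(x, M);    // s.t. x <= M
            boolean ok = cplex.solve();
            System.out.println("presolve=" + presolve + ": " + cplex.getStatus()
                    + (ok ? ", x = " + cplex.getValue(x) : ""));
            cplex.end();
        }
    }
}
```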

I discovered this purely by accident, because I'm a firm believer that no model should have coefficients (including bounds) that require me to type two-digit exponent fields in scientific notation. If there are values that large, it's time to rescale. It just caught me by surprise that the surrogate for $\infty$ in the presolver was not the same one used in the APIs.

18 comments:

  1. I think all of these little "details" should be listed in the manual. Unfortunately, they usually aren't (not sure if this particular one is). And if they (CPLEX) don't want to do it because there are too many, at least create a web page with those hidden facts that have been discovered so far by their users. It would save people a lot of time and headaches.

    On a separate, unrelated note: last week I just happened to run into your "evil twin" (former student) Bruno Gimenez (the one in that photo on your web site). His wife was one of my former MBA students graduating this Spring. It's a small world after all! :-)

  2. @Tallys: I think this particular fact is undocumented, but in any case there's enough complexity to products like CPLEX that things of this type, even if documented, probably won't register on people reading the documentation (if, in fact, anyone actually does read manuals cover-to-cover anymore). The notion of an indexed, searchable web page of little "gotchas" makes a lot of sense to me.

    Re my "twin", that's an incredible coincidence! Did Bruno tell you the story behind the photo? If you get a chance, please tell him I said 'hi'.

  3. Interesting topic. The real issue is bad software design, yes, I said it! The user should be forced to indicate whether a bound is infinite or not by specifying a bound key, so a variable can be BK_FX, BK_LO, etc.

    I understand why CPLEX designed it with numeric bounds: users are lazy (some are also not good programmers :-)) and don't want to specify bound keys. It boils down to the usual "the solver should solve anything I throw at it".

    In my own LP interface you can do both, but the recommended way is by using bound keys.

  4. @Bo: From a modeling perspective, I think I agree. I'd rather have a symbol dedicated to infinity, to keep things clear. I don't know that using a bound key would (or should) change the presolver behavior, though. Some users insist on having Really Big Numbers in their models (which are intended to be finite), and the problem of instability from the presolver fixing a variable at a Really Big Bound remains.

  5. @Paul: It makes no sense to have numbers higher than, say, 1.0e+10. From a numerical point of view, the rounding errors can be very large. Most users formulating such models don't realize these models can often be formulated as a two-stage problem.

  6. @Bo: ... or just scale the variables ...

  7. There are cases where scaling does not help (mostly due to users not modelling in a good way).

  8. Most of the models I've seen with large numbers that cannot be eliminated by scaling variables are "big M" formulations where the user has basically no idea for a good value of M. That's where I endorse your two-stage approach (specifically, Benders decomposition).

  9. Exactly, these cases definitely fall into the category I talk about. Other models could be long-term discounting, i.e., variables are priced with some interest rate over time.

    My point: Don't slam a high M number into your model, but if you do, you should have earned the right to do so first!

  10. Paul, like Tallys, I wish there were more information about this sort of thing in the manual. Thanks for cataloging this little quirk. Your blog post makes a case for why grad students should take a course in scientific computing. Taking such a course in my first semester of grad school was a great investment. It was extremely useful in figuring out why optimization solvers like CPLEX would occasionally give weird answers during my PhD studies (and I never put numbers in my models with two-digit exponent fields in scientific notation!)

  11. Laura, thanks for the comment. My first course in grad school was "numerical methods" (which I suspect is similar to "scientific computing" courses today). It both gave me an appreciation for adventures in rounding/truncation and also immediately lifted from my shoulders any worries about maintaining a 4.0 average in grad school. ;-) I'll confess that I'm frequently baffled by questions on tech forums (notably CPLEX forums) about why someone's binary IP model coughed up a variable value of 0.0000001 or 0.9999999. It's not an integer!!!! The part that baffles me is how they ended up using sophisticated software to do numerical optimization without knowing about rounding errors.

  12. Paul: yes, Bruno told me about the story. It all began when he told me he got his MBA at MSU and I said, "Did you take classes with Paul Rubin?". I'll tell him you said hi when I get a chance.

    I also took a numerical methods course, but during my undergrad (computer engineering). Unfortunately, at the time I didn't know about OR yet, so I didn't get as much out of it as I should have.

  13. Across industries, I've yet to see a non-trivial practical case where solid finite bounds couldn't be obtained a priori by a better understanding of the underlying business problem. In general, the upper/lower bounds in a model are our allowance for the real-life "variance" of a decision variable, and this variation is almost always well understood.

    Beyond that, the default bounds in CPLEX are for those who shoot first (run the solver first) and then ask questions.

  14. @Shiva: I agree that in real-world (or maybe I should say "realistic", so as not to exclude my academic brethren) problems, it should always be possible to find bounds with modest effort. _Tight_ bounds take more work (but the work frequently pays off). I wish that textbooks would take the approach of requiring bounds for primal problems.

    Dual problems are a bit different, in that unbounded duals actually make sense in practice (they're nature's way of telling you the primal is overconstrained).

  15. @Shiva: While I agree that solid finite bounds can usually be obtained, CPLEX does a lot of bound strengthening during the presolve phase. So, users don't need to spend a lot of time trying to derive bounds based on the algebraic aspects of the models. But, if they know of bounds that cannot be derived algebraically, then those indeed can improve the numerical characteristics of the model.

  16. @Ed_Klotz:
    "But, if they know of bounds that cannot be derived algebraically, then those indeed can improve the numerical characteristics of the model"

    Quite often, the most important decision variables do tend to have this property in realistic situations because of the business insight available. On the other hand, customers do get impressed when bounds get strengthened due to cascading combinatorial effects, for example, and this is explained properly to them. This is an example of the reverse case, where OR provides some quick insight into the impact of their business decisions.

  17. @Paul: "Dual problems are a bit different, in that unbounded duals actually make sense in practice"

    In realistic situations, not always. Problems that are solved using column generation are good examples where key dual variables tend to have a valuable and recognizable physical meaning. This prior knowledge is often exploited to reduce dual bounds from big M and prune away poor-quality regions, thereby improving both solution quality (for large/complex instances) and convergence.

    I vaguely recall this topic came up once before in a previous post in this blog in the form of "constraint violation costs" ...

  18. @Shiva: You're right about bounding duals having come up in a prior post. I agree that unbounded dual variables do not always make sense. What I meant was that an unbounded dual variable _sometimes_ makes sense, either because the primal is infeasible or because any change to a primal rhs in a particular direction will cause the primal to become infeasible; whereas I don't think unbounded primal variables make sense in any real-world problem.

