## Friday, April 27, 2018

### Big M and Integrality Tolerance

A change I made to an answer I posted on OR-Exchange, based on a comment from a well-informed user of OR-X, might be worth repeating here on the blog. It has to do with issues that can occur when using "big M" type integer programming models, a topic I've covered here before.

As I mentioned in that previous post, part of the problem in "big M" formulations stems from the inevitable rounding error in any non-integer computations done on a computer. A particular manifestation of rounding error (regardless of whether there are "big M" coefficients in the model or not) is that the double precision value assigned to integer variables in a solution will not necessarily be integers. With surprising frequency, I see users of MIP software demanding to know why the answer they got for their integer variable was 0.9999999999975 or 1.000000000032 rather than exactly 1. The answer has two parts: (a) rounding error is pretty much inevitable; and (b) the software designers accepted that reality and decreed that anything "close enough" to the nearest integer counts as being an integer for purposes of deciding if a solution is feasible. (Why the software prints all those decimal places rather than rounding for you is a separate question that I will not address.)

So the solver generally has a parameter that gives an "integrality tolerance", just as it has a (typically separate) parameter for how close the expression in a constraint has to be to the allowed value(s) to be considered feasible (again, a nod to rounding error). In CPLEX, the name of the integrality tolerance parameter is some variant of "MIP.Tolerances.Integrality" (in earlier versions, the much more compact "EpInt"), and its default value (as of this writing) is 1.0E-5.

So now I'll connect that to "big M". One of the more common uses of "big M" is to capture a logical constraint of the form "if condition is true then expression is limited". For instance, you might want to build into the model that if a warehouse is not open (the condition) then shipments from it must equal (or cannot exceed) zero. Algebraically, this frequently appears as $$f(x) \le My$$where $f(x)$ is some (hopefully linear) expression involving variable $x$ and $y$ is a binary variable (with 1 meaning the constraint is relaxed and 0 meaning it is enforced). In a world of exact arithmetic, and with $M$ chosen large enough, $y=1$ means the value of $f(x)$ is essentially unbounded above, while $y=0$ means $f(x)\le 0$.

Here the issue of integrality tolerance sneaks in. Suppose that we choose some really large value of $M$, say $M=1E+10$, and that the solver decides to accept a solution where $y=1.0E-6$ (which is within CPLEX's default integrality tolerance). From the solver's perspective, $y=0$. Logically, that should mean $f(x)\le 0$, but given the rounding error in $y$ and the large value of $M$ what you actually get is $f(x)\le 10,000$. So, borrowing from my earlier example, I've got a closed warehouse shipping 10,000 units of whatever. Oops.

A common reaction to this (and by "common" I mean I've seen it multiple times on help forums) is to say "I'll set the integrality tolerance to zero". Good luck with that. First, it's not guaranteed the software will let you. (CPLEX will.) Second, if it does let you do it, you might get a suboptimal solution, or be told your perfectly feasible problem is actually infeasible, because the software couldn't get the true optimal solution (or perhaps any solution) to have zero rounding error in all the integer variables.

If you run into incorrect solutions in a "big M" model, some combination of tightening the integrality tolerance (but not all the way to zero) and ratcheting down the size of $M$ may fix things ... but, as with all aspects of MIP models, there are no guarantees.

1. You should mention that it is possible to solve MIPs in exact arithmetic. It is not very fast (and often it almost takes an infinite amount of time or memory), but for some cases, it might be a viable solution.

Some time ago, there were several very promising talks about an exact version of SCIP using iterative refinement, but unfortunately, it seems to have disappeared.

Myself, I played around with exact arithmetic a while ago. I coded a LP-/MIP-solver that is completely based on exact data types like the GNU GMP. It can solve some problems and has partly been added to polymake (the integer part only partly in the latest development branch). If somebody ever reads this post and wants to give it a try, just drop me an email or contact me through the polymake forum. I'll be happy to share the code and give instructions how to use it.

Best regards,
Thomas

1. I've seen some discussions of using exact arithmetic, typically assuming that the coefficients are integer (or at least rational). Of course, every coefficient is "rational" when you type it (since you only type a few decimal places), but it's unclear to me whether using exact arithmetic with approximate coefficients (and no rounding tolerance) leads to the same, better or worse solutions compared to what you get with double precision arithmetic.

Thanks for offering to help readers with your contribution to polymake! I remember hearing something about an exact version of SCIP but did not pay close attention (and did not realize it never came to fruition).

2. You are right, data has to be accurate. Concerning my solver, I have written an LP file reader that can read arbitraty large floats and also fractions, e.g. 1234/5678.

At least, an exact solver will (if it ever terminates) lead to results that solve the given problem exactly. If the problem is given inexactly, then of course, an exact solution is not really helpful. Using exact arithmetic makes sense if you want to prove something or need exact results to exact data. In any other case, one should use fast inexact solvers.

My solver can in principle use any exact C++ data type. For all my tests, I use GMP rationals (mpq_class), but polymake uses an extension to that so that up to some degree, irrational data can be used. But in my opinion, irrational data only makes sense for pure linear programs, as some of the IP theory is based on rational numbers. Especially, it might be difficult to determine if some MIP is infeasible or unbounded.

Best regards,
Thomas

3. Well, if you are doing a model for the government (any government), I suspect you are stuck with "irrational data". ;-) There are certainly problems which come "naturally" with rational (if not integer) coefficients, and for those an exact solver would make good sense (provided it performed adequately).

2. I think the problem is not adequacy but runtime. If the whole solver is based on exact data types and all operations are performed without tolerances and the code itself is correct, then the results should be fine.

Another question: Do you know of some "general" irrational C++ data type similar to the rational mpq_class from GMP? With polymake's field extension, one might already be able to solve some irrational problems. But I have to admit that I don't know how general this data type really is.

By the way: If somebody has some numerically troublesome MIP instances of moderate size (a few dozens of rows and columns), I would be very happy if he/she could provide a copy as (exact) LP file.

Best regards,
Thomas

1. By "adequately" I meant reasonable run time (with reasonable memory requirements). As for your question, sorry, but no. I haven't touched C++ in a decade or two (or more), and when I was last cursed with using it all I knew were the usual primitive data types.

If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on OR-Exchange.

Due to a particularly persistent spammer, comments are temporarily (?) being moderated.