Saturday, March 28, 2015

CPLEX Performance Variability

It may come as a surprise to some users of CPLEX that its performance is, somewhat intentionally, random. (This may be true of some other optimization software as well. I'll limit the assertion to CPLEX because CPLEX is the only solver for which I have first-hand knowledge that it's true.)

A certain amount of uncertainty about performance is probably inevitable. Operating systems swap programs in and out of memory and processes in and out of CPU cores in ways uncontrollable by the CPLEX user, so "random" fluctuations in timing are to be expected even when running code with identical parameter settings repeatedly on a single problem on a single machine. When the code is itself multi-threaded, I suspect even more "random" timing issues occur. (CPLEX by default uses multiple threads, but can be restricted to a single thread via a parameter setting.)

Less well known is that CPLEX uses a pseudorandom number stream to control some of its internal decision-making. Marc-André Carle (@ORNinja) has an excellent series of posts on his blog about performance variability, so I'll save myself a ton of typing and refer you there:
I thought I'd share some recent results that demonstrate how one particular source of variability, the random seed, can make a large difference in performance. The name for the seed in the Java API is IloCplex.Param.RandomSeed. According to the CPLEX 12.6.1 parameters manual, its default value changes with each new release of the program, so if you run two different versions of CPLEX on the same problem, you can expect different performance even if none of the changes from one version to the next directly impact your problem. Put another way, your mileage WILL vary.

Here's an example. I took a moderately chewy MIP model that arises in a current research project. I'm not at liberty (nor inclined) to give a full description, but the first few lines of the CPLEX log show that it has what are, by contemporary standards, modest dimensions.

Reduced MIP has 20434 rows, 10239 columns, and 78243 nonzeros.
Reduced MIP has 215 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.08 sec. (37.97 ticks)
Found incumbent of value 216.000000 after 0.23 sec. (89.01 ticks)

This is a cost minimization problem with an integer-valued objective function. The trivial feasible solution with cost 216 is always found immediately; it's from there that the adventure begins. This version of the problem is not too difficult. Although I have not solved it to optimality yet, I'm fairly confident that the optimal solution has cost 11, and I suspect that a few hours of crunching on my quad-core PC would be enough to prove that solution is optimal.

I did a little test, using CPLEX 12.6.1, in which I did 85 independent solves, using different values of the seed parameter. Runs were limited to 60 seconds each and three CPU threads, and the Emphasis.MIP parameter was set to 4 (emphasizing a search for hidden feasible solutions -- improving incumbents -- over moving the lower bound or proving optimality). In other words, I told CPLEX to find good solutions fast, and to worry about proving optimality later. All other CPLEX parameters were left at default.

Here is a histogram of the lower bounds from those 85 runs:
histogram of lower bounds
The bounds ranged from 9.38 to 9.60, and you can see that the vast majority were at the low (inferior) end of the range. Now for a tabulation of the incumbents found within one minute:

Incumbent 11 12 144
Frequency 1 3 81

Recall that I am fairly certain the optimal value is 11. Out of 85 attempts, I landed on the optimal solution once and a near-optimal solution three times; on the other 81 tries, I got a fairly terrible solution (costing more than 13 times the optimum). There are some intermediate solutions with costs in the 20s and 30s that show up in the node logs for the runs where I got lucky; CPLEX just never happened to be sitting on one of those when the time limit expired.

The point here is that, in practice, users don't run the same problem multiple times with different seeds (if they even know the seed parameter exists, which I suspect most do not). This caught my eye because I'd done a test run on this example, gotten the incumbent down from 216 to 144 (with a lower bound under 10) after a minute or so, and thought "hmm, this is going to take a while". I might add that 144 was found fairly quickly, so there was a lengthy stretch with no progress in the incumbent before time ran out. Had I been lucky on that pilot run (where, if the table is to be believed, being lucky has probability a bit under 1 in 20), I would have gotten a really good solution (11 or 12) in the first minute, with a lower bound of 10 (rounding up, since the objective is integer-valued), and thought "hmm, this problem is pretty easy".

So, is the problem easy or not? Beats me.

I'll finish this with a pointer to a slide in a recent present on advances in CPLEX. The slide mentions a "racing ramp-up phase" that can be used when running CPLEX on multiple processors in parallel. In essence, CPLEX tackles the same problem independently on multiple processors, using different parameter settings for each. After a set amount of time (the end of the "ramp-up" phase), the results are compared, a winner is declared, and the processors then all proceed to work on finishing the winner's search tree, using the winner's parameter settings.

With or without parallel processors, a user can replicate this to some extent. In particular, you can do a set of short runs with different seeds, find the best incumbent solution found by any of them, restart with that seed and do a full production run. I don't think there's a way to recycle the winner's search tree (unless you get lucky and the last run was the winner, in which case you can just continue it), but if the ramp-up phase is kept short, this could still help.

Bottom line: a "hard" problem might be truly hard (CPLEX will take a long time no matter seed you use), or it might be hard in the sense that there are (many) more bad seeds than good. Then again, maybe you just need better karma.

No comments:

Post a Comment

Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.