In yesterday's post I described a graph optimization problem (from a post on OR Stack Exchange) and how to generate random test instances. You are given a digraph with a single commodity flowing through it. There may be multiple supply and demand nodes (with supply and demand in balance), and arcs have neither costs nor capacity limits. The problem is to find a subgraph with all the original nodes and a minimal number of arcs (the "support" of the flow) such that there is a feasible flow pattern to satisfy all demands. In this post, I will discuss a couple of mixed integer programming (MIP) models for the problem.
I will use the following notation. The nodes are $n_1, \dots, n_N$, and the supply or demand at node $n_i$ is given by $s_i,$ where $s_i > 0$ at supply nodes and $s_i < 0$ at demand nodes. The total supply is given by $F.$ The set of arcs is $A,$ and at each node $n_i$ we denote by $\delta^+(n_i)$ and $\delta^-(n_i)$ respectively the set of arcs flowing into and out of $n_i.$ The common elements of my two MIP models are the following.
- Variable $x_a \in \lbrace 0,1\rbrace$ is 1 if and only if arc $a\in A$ is selected for the subgraph.
- Variable $y_a \in [0, F]$ is the flow volume over arc $a \in A.$
- The objective is to minimize the total number of arcs selected:
$$\textrm{minimize }\sum_{a\in A} x_a.$$ - For each node $n_i$ we require that flows in and out combined with any supply or demand balance:
$$\sum_{a \in \delta^-(n_i)} y_a - \sum_{a\in \delta^+(a)} y_a = s_i.$$
The models differ in the remaining requirement, that there be no flow on any arcs that were not selected (i.e., $y_a = 0$ if $x_a =0$). Those constraints are added as follows.
- In the "big M" model, for each $a\in A$ we add the constraint $y_a \le Fx_a.$
- In the "indicators" model, for each $a\in A$ we add an if-then constraint $x_a = 0 \implies y_a = 0.$
I tested both models using two different solvers, IBM CPLEX 22.1.1 and FICO Xpress 9.5 Optimizer (version 44.01.01). Before running a "production" problem I ran all four combinations of model and solver on a small instance using the solvers' respective tuning routines. In three cases, the default solver settings seemed best. In one case (CPLEX on the big M model), the tuner suggested a couple of nondefault parameter settings, but they fared poorly on the production problem. So I used default parameter settings on all the production runs.
The production problem had 25 supply nodes, 34 demand nodes and 41 transit nodes (nodes where supply was 0) and 526 arcs. Total supply was $F=1000.$ I gave each combination of model and solver a one hour time limit (on my slightly vintage PC). I was mainly curious about how the two models would compare, and secondarily on how the two solvers would compare. Of course, running one test instance for one hour per combination is far from probative, but my curiosity has its bounds. Here is what I found.
Solver | Model | Incumbent | Lower bound | Gap (%) |
---|---|---|---|---|
CPLEX | big M | 55 | 49.5 | 10 |
CPLEX | indicators | 54 | 48 | 11 |
Xpress | big M | 57 | 49 | 16 |
Xpress | indicators | 58 | 49 | 18 |
There are some differences among combinations in the results (which, again, might not bear up under multiple tests), but what I found a bit interesting was that the gap never made it below 10% in any combination, even though I consider the test problem to be not particularly large. (Also slightly interesting was that only roughly 10% of the arcs in the graph were needed.)
I will note one difference in the solvers. I'm not sure if it is a function of different default parameter settings or different memory management. Within the one hour run time limit, neither attempt with Xpress ran into memory issues. In contrast, one of the CPLEX runs exhausted system memory (and hung the system) before the hour was up. So I did the other CPLEX run with a limit of 9500 MB on the tree size (set via the parameter CPXPARAM_MIP_Limits_TreeMemory), and that run ended due to memory exhaustion before the hour was up. (Both CPLEX runs lasted only about 53 minutes.)
One of the main reasons I ran the tests was to see whether the model with indicators would give be tighter than the big M model. A bit of wisdom I received a long time ago was that big M was better than indicators if you have insights into the model that allow you to use a not terribly large value of $M.$ Here, the worst case would be if the entire flow volume $F$ passed through a single arc, which lets me use $F$ (1000 in the test problem) as my value of $M.$ The big M runs did produce smaller gaps than their indicator counterparts, but not by much, and possibly not by a "statistically significant" amount (if you can even mention "statistical significance" while working with a sample size of 1 🥲).
Still, for now I will stick to preferring big M constraints with not-so-big values of $M$ over indicator constraints.
As mentioned in the previous post, you can find my code here, including a README.md file that explains the code structure.
How does Benders decomposition do for both models?
ReplyDeleteWhat decomposition do you have in mind? Master problem picks arcs, subproblem tries to find flows and returns a no-good cut if there are not enough arcs?
DeleteYes, either combinatorial no-good cuts or classical feasibility cuts from the subproblem extreme rays (as in the automated Benders decomposition algorithm provided by some solvers).
DeleteInteresting question. I'll add that to my list of things to try. (Right now I'm investigating redundant constraints.)
DeleteI tried automatic Benders decomposition (arc selection binaries in the master, flow variables in a single subproblem) on a couple of test problems, using CPLEX. I was limited to the big M method; CPLEX cannot decompose the model with if-then constraints. In five minute runs, Benders did worse than the original approach on both incumbent solution and best bound. I also tried adding some redundant constraints to the master (must use at least one exiting arc at each supply node and at least one entering arc at each demand node), but Benders continued to do worse than just plain big M.
Delete