OR in an OB World: Minimizing Flow Support (II)

Wednesday, January 29, 2025

Minimizing Flow Support (II)

In yesterday's post I described a graph optimization problem (from a post on OR Stack Exchange) and how to generate random test instances. You are given a digraph with a single commodity flowing through it. There may be multiple supply and demand nodes (with supply and demand in balance), and arcs have neither costs nor capacity limits. The problem is to find a subgraph with all the original nodes and a minimal number of arcs (the "support" of the flow) such that there is a feasible flow pattern to satisfy all demands. In this post, I will discuss a couple of mixed integer programming (MIP) models for the problem.

I will use the following notation. The nodes are $n_1, \dots, n_N$, and the supply or demand at node $n_i$ is given by $s_i,$ where $s_i > 0$ at supply nodes and $s_i < 0$ at demand nodes. The total supply is given by $F.$ The set of arcs is $A,$ and at each node $n_i$ we denote by $\delta^+(n_i)$ and $\delta^-(n_i)$ respectively the set of arcs flowing into and out of $n_i.$ The common elements of my two MIP models are the following.

Variable $x_a \in \lbrace 0,1\rbrace$ is 1 if and only if arc $a\in A$ is selected for the subgraph.
Variable $y_a \in [0, F]$ is the flow volume over arc $a \in A.$
The objective is to minimize the total number of arcs selected:

$$\textrm{minimize }\sum_{a\in A} x_a.$$
For each node $n_i$ we require that flows in and out combined with any supply or demand balance:

$$\sum_{a \in \delta^-(n_i)} y_a - \sum_{a\in \delta^+(a)} y_a = s_i.$$

The models differ in the remaining requirement, that there be no flow on any arcs that were not selected (i.e., $y_a = 0$ if $x_a =0$). Those constraints are added as follows.

In the "big M" model, for each $a\in A$ we add the constraint $y_a \le Fx_a.$
In the "indicators" model, for each $a\in A$ we add an if-then constraint $x_a = 0 \implies y_a = 0.$

I tested both models using two different solvers, IBM CPLEX 22.1.1 and FICO Xpress 9.5 Optimizer (version 44.01.01). Before running a "production" problem I ran all four combinations of model and solver on a small instance using the solvers' respective tuning routines. In three cases, the default solver settings seemed best. In one case (CPLEX on the big M model), the tuner suggested a couple of nondefault parameter settings, but they fared poorly on the production problem. So I used default parameter settings on all the production runs.

The production problem had 25 supply nodes, 34 demand nodes and 41 transit nodes (nodes where supply was 0) and 526 arcs. Total supply was $F=1000.$ I gave each combination of model and solver a one hour time limit (on my slightly vintage PC). I was mainly curious about how the two models would compare, and secondarily on how the two solvers would compare. Of course, running one test instance for one hour per combination is far from probative, but my curiosity has its bounds. Here is what I found.

Solver	Model	Incumbent	Lower bound	Gap (%)
CPLEX	big M	55	49.5	10
CPLEX	indicators	54	48	11
Xpress	big M	57	49	16
Xpress	indicators	58	49	18

There are some differences among combinations in the results (which, again, might not bear up under multiple tests), but what I found a bit interesting was that the gap never made it below 10% in any combination, even though I consider the test problem to be not particularly large. (Also slightly interesting was that only roughly 10% of the arcs in the graph were needed.)

I will note one difference in the solvers. I'm not sure if it is a function of different default parameter settings or different memory management. Within the one hour run time limit, neither attempt with Xpress ran into memory issues. In contrast, one of the CPLEX runs exhausted system memory (and hung the system) before the hour was up. So I did the other CPLEX run with a limit of 9500 MB on the tree size (set via the parameter CPXPARAM_MIP_Limits_TreeMemory), and that run ended due to memory exhaustion before the hour was up. (Both CPLEX runs lasted only about 53 minutes.)

One of the main reasons I ran the tests was to see whether the model with indicators would give be tighter than the big M model. A bit of wisdom I received a long time ago was that big M was better than indicators if you have insights into the model that allow you to use a not terribly large value of $M.$ Here, the worst case would be if the entire flow volume $F$ passed through a single arc, which lets me use $F$ (1000 in the test problem) as my value of $M.$ The big M runs did produce smaller gaps than their indicator counterparts, but not by much, and possibly not by a "statistically significant" amount (if you can even mention "statistical significance" while working with a sample size of 1 🥲).

Still, for now I will stick to preferring big M constraints with not-so-big values of $M$ over indicator constraints.

As mentioned in the previous post, you can find my code here, including a README.md file that explains the code structure.

10 comments:

Rob PrattJanuary 30, 2025 at 9:13 AM
How does Benders decomposition do for both models?
ReplyDelete
Replies
Rob PrattJanuary 31, 2025 at 12:01 PM
The second set of redundant constraints for y is dominated by the flow balance constraints, so I would not expect including those to help. Do you mind providing the .gv file and the supply/demand vector if it is not already part of the file as node attributes?
ReplyDelete
Replies

Add comment

Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.

OR in an OB World

Wednesday, January 29, 2025

Minimizing Flow Support (II)

10 comments:

Previous Posts

Labels