Monday, February 10, 2025

CPLEX Drops Documentation

CPLEX Studio 22.1.2 is now available for download, at least on most (possibly not all) supported platforms. The previous version was 22.1.1, and judging from the numbering I assume this is a fairly minor update. That's the good news. The bad news is that the new version no longer installs documentation.

Previous versions created a folder named "doc" in the main folder (parallel to "concert", "cplex", "cpoptimizer" etc.). Under the "doc" folder, if you drilled down a couple of levels, were all sorts of manuals. Particularly important to me were the "refjavacplex" and "refjavacpoptimizer" folders, which contained Javadoc documentation that could be integrated into an IDE (NetBeans in my case) for Java programming. It was also convenient to have the reference manuals installed locally, so that I could bookmark them in my browser and access them even if I was working offline.

Version 22.1.2 does not install the "doc" folder. The reference manuals are still available online if you know where to look (https://www.ibm.com/docs/en/icos/22.1.2), but I do not see a way to make that work as Javadoc in an IDE (and, of course, it is only available when you are online, unless you want to web-scrape it to get a local copy).

I've submitted an "idea" to restore the documentation to the download. If you too want it back, please vote for the suggestion. I would have no objection to it being a separate download, but taking away the Javadoc really seems a bit unfriendly to me.

Meanwhile, I am linking my IDE to the Javadoc for version 22.1.1 and hoping that nothing much has changed.

Wednesday, January 29, 2025

Minimizing Flow Support (II)

In yesterday's post I described a graph optimization problem (from a post on OR Stack Exchange) and how to generate random test instances. You are given a digraph with a single commodity flowing through it. There may be multiple supply and demand nodes (with supply and demand in balance), and arcs have neither costs nor capacity limits. The problem is to find a subgraph with all the original nodes and a minimal number of arcs (the "support" of the flow) such that there is a feasible flow pattern to satisfy all demands. In this post, I will discuss a couple of mixed integer programming (MIP) models for the problem.

I will use the following notation. The nodes are $n_1, \dots, n_N$, and the supply or demand at node $n_i$ is given by $s_i,$ where $s_i > 0$ at supply nodes and $s_i < 0$ at demand nodes. The total supply is given by $F.$ The set of arcs is $A,$ and at each node $n_i$ we denote by $\delta^+(n_i)$ and $\delta^-(n_i)$ the sets of arcs flowing into and out of $n_i,$ respectively. The common elements of my two MIP models are the following.

  • Variable $x_a \in \lbrace 0,1\rbrace$ is 1 if and only if arc $a\in A$ is selected for the subgraph.
  • Variable $y_a \in [0, F]$ is the flow volume over arc $a \in A.$
  • The objective is to minimize the total number of arcs selected:
       
    $$\textrm{minimize }\sum_{a\in A} x_a.$$
  • For each node $n_i$ we require flow balance: the flow out of the node minus the flow into it must equal its supply (or negative demand):
       
    $$\sum_{a \in \delta^-(n_i)} y_a - \sum_{a\in \delta^+(n_i)} y_a = s_i.$$

The models differ in the remaining requirement, that there be no flow on any arc that is not selected (i.e., $y_a = 0$ if $x_a = 0$). Those constraints are added as follows (a Java sketch of both variants appears after the list).

  • In the "big M" model, for each $a\in A$ we add the constraint $y_a \le Fx_a.$
  • In the "indicators" model, for each $a\in A$ we add an if-then constraint $x_a = 0 \implies y_a = 0.$

I tested both models using two different solvers, IBM CPLEX 22.1.1 and FICO Xpress 9.5 Optimizer (version 44.01.01). Before running a "production" problem I ran all four combinations of model and solver on a small instance using the solvers' respective tuning routines. In three cases, the default solver settings seemed best. In one case (CPLEX on the big M model), the tuner suggested a couple of nondefault parameter settings, but they fared poorly on the production problem. So I used default parameter settings on all the production runs.

The production problem had 25 supply nodes, 34 demand nodes, 41 transit nodes (nodes where supply was 0), and 526 arcs. Total supply was $F=1000.$ I gave each combination of model and solver a one-hour time limit (on my slightly vintage PC). I was mainly curious about how the two models would compare, and secondarily about how the two solvers would compare. Of course, running one test instance for one hour per combination is far from probative, but my curiosity has its bounds. Here is what I found.


Solver   Model        Incumbent   Lower bound   Gap (%)
CPLEX    big M        55          49.5          10
CPLEX    indicators   54          48            11
Xpress   big M        57          49            16
Xpress   indicators   58          49            18

There are some differences among combinations in the results (which, again, might not bear up under multiple tests), but what I found a bit interesting was that the gap never made it below 10% in any combination, even though I consider the test problem to be not particularly large. (Also slightly interesting was that only roughly 10% of the arcs in the graph were needed.)

I will note one difference between the solvers. I'm not sure whether it is a function of different default parameter settings or different memory management. Within the one-hour run time limit, neither attempt with Xpress ran into memory issues. In contrast, one of the CPLEX runs exhausted system memory (and hung the system) before the hour was up. So I did the other CPLEX run with a limit of 9500 MB on the tree size (set via the parameter CPXPARAM_MIP_Limits_TreeMemory), and that run ended due to memory exhaustion before the hour was up. (Both CPLEX runs lasted only about 53 minutes.)
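
For what it's worth, I believe the equivalent setting in the Java API looks something like the following (with cplex being the IloCplex instance).

```java
// Cap the branch-and-bound tree at roughly 9500 MB -- the Java API counterpart
// of CPXPARAM_MIP_Limits_TreeMemory.
cplex.setParam(IloCplex.Param.MIP.Limits.TreeMemory, 9500.0);
```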

One of the main reasons I ran the tests was to see whether the model with indicators would be tighter than the big M model. A bit of wisdom I received a long time ago was that big M was better than indicators if you have insights into the model that allow you to use a not terribly large value of $M.$ Here, the worst case would be if the entire flow volume $F$ passed through a single arc, which lets me use $F$ (1000 in the test problem) as my value of $M.$ The big M runs did produce smaller gaps than their indicator counterparts, but not by much, and possibly not by a "statistically significant" amount (if you can even mention "statistical significance" while working with a sample size of 1 🥲).

Still, for now I will stick to preferring big M constraints with not-so-big values of $M$ over indicator constraints.

As mentioned in the previous post, you can find my code here, including a README.md file that explains the code structure.

Tuesday, January 28, 2025

Minimizing Flow Support (I)

This is the first of hopefully two posts related to a question posted on Operations Research Stack Exchange. You are given a digraph through which a single commodity flows. Arcs have neither costs nor capacity limits. Each node $n_i$ is either a supply node (with supply $s_i>0$), a demand node (with demand $s_i<0$ treated as a "negative supply" in the optimization models), or what I will call a "transit node" ($s_i =0$) with neither supply nor demand. Key assumptions are that total supply equals total demand and that it is possible to find paths through the digraph satisfying all demands (and thus consuming all supplies).

The problem is to select a minimal number of arcs such that the reduced digraph (using all the original nodes but just the selected arcs) contains routes fulfilling all demands. In this post, I'll describe one way to generate random test instances of the problem. The following post will discuss modeling and solving the problem. I have Java code demonstrating both parts in my university GitLab repository.

My approach to generating a test instance starts with specification of the number of nodes in the graph and a lower bound for the number of arcs. (The lower bound might need to be exceeded in order to ensure that a feasible flow satisfying all demands exists.) My code also asks the user to specify an integer value $F$ for total supply (and total demand). I used integer flows just to make printing the supply/demand at each node and flow on each arc neat. You might prefer to use real valued flows and (since the problem is invariant with respect to the flow volume) just set the total supply/demand at 1.

In what follows, I will use the terms "upstream" and "downstream" to refer to nodes from which there are directed paths to a given node (upstream) or to which there are directed paths from a given node (downstream). To simplify the explanation, I will treat demands as positive values. The construction process starts by creating the desired number of nodes and partitioning them into supply, demand and transit nodes. Since you need at least one supply node and at least one demand node, the first node is assigned as a supply node and the second as a demand node. (My code also assigns one node as a transit node, but if you do not care whether there are any transit nodes you can skip that.) The remaining nodes are randomly classified as supply, demand or transit. Since my code uses integer flow values, each supply (demand) node needs a supply (demand) of at least 1. If the partitioning process creates more than $F$ supply or demand nodes, the excess nodes are reclassified as transit nodes. If you are using real flow values, this last step can be skipped.
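
A minimal sketch of the partitioning step might look like the following. The names and structure here are illustrative inventions of mine, not copied from the repository code.

```java
import java.util.List;
import java.util.Random;

// Illustrative fragment (not the repository code): partition node indices
// 0..nNodes-1 into supply, demand and transit sets. The first three nodes are
// forced to be supply, demand and transit, respectively; the rest are random.
static void partitionNodes(int nNodes, Random rng, List<Integer> supplyNodes,
                           List<Integer> demandNodes, List<Integer> transitNodes) {
  supplyNodes.add(0);
  demandNodes.add(1);
  transitNodes.add(2);
  for (int i = 3; i < nNodes; i++) {
    switch (rng.nextInt(3)) {
      case 0  -> supplyNodes.add(i);
      case 1  -> demandNodes.add(i);
      default -> transitNodes.add(i);
    }
  }
  // With integer flows, any supply or demand nodes in excess of F would be
  // reclassified as transit nodes at this point.
}
```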

The next step is to allocate total supply (total demand) randomly across the supply (demand) nodes. Again, since I am using integer flows, my code first allocates one unit of supply or demand to each non-transit node, then allocates the remaining supply (demand) one unit at a time, randomly choosing with replacement a supply (demand) node to receive the next unit.
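
As a hedged illustration (again, not the actual repository code), the unit-by-unit allocation could be as simple as this, with demands allocated the same way and then negated.

```java
import java.util.Arrays;
import java.util.Random;

// Spread "total" integer units across "nodeCount" nodes: one unit to each node
// first, then the remaining units assigned uniformly at random with replacement.
static int[] allocate(int total, int nodeCount, Random rng) {
  int[] amount = new int[nodeCount];
  Arrays.fill(amount, 1);                  // every node gets at least one unit
  for (int unit = nodeCount; unit < total; unit++) {
    amount[rng.nextInt(nodeCount)]++;      // remaining units, one at a time
  }
  return amount;
}
```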

With the nodes created, it is time to move on to creating arcs. My code allocates to each node of any type two initially empty sets of nodes, those upstream and downstream, and two initially empty sets of arcs, those entering and those leaving the node. It also assigns a temporary variable containing the excess supply if the node is a supply node or the unmet demand if the node is a demand node. Two sets of nodes are created, one containing supply nodes with unused supply (initially, all the supply nodes) and the other containing demand nodes with unmet demands (initially, all the demand nodes).
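
One possible per-node bookkeeping record is sketched below. This is my own illustrative class, not the one in the repository; "remaining" holds unused supply (positive) or unmet demand (negative) during construction, and is zero for transit nodes.

```java
import java.util.HashSet;
import java.util.Set;

class NodeInfo {
  final int id;
  int remaining;                                     // unused supply / unmet demand
  final Set<Integer> upstream   = new HashSet<>();   // nodes with a directed path to this one
  final Set<Integer> downstream = new HashSet<>();   // nodes reachable from this one
  final Set<int[]>   arcsIn     = new HashSet<>();   // arcs entering this node
  final Set<int[]>   arcsOut    = new HashSet<>();   // arcs leaving this node

  NodeInfo(int id, int supplyOrDemand) {
    this.id = id;
    this.remaining = supplyOrDemand;
  }
}
```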

A list of all possible arcs (excluding arcs from a node to itself) is created and randomly shuffled. Arcs are now added from that list until enough directed paths exist to ensure that all demands can be met. As each arc $(a, b)$ is added to the digraph, it is added to the set of arcs exiting the tail node $a$ and to the set of arcs entering the head node $b.$ The set of nodes upstream of $b$ is updated to include $a$ and all nodes upstream of $a,$ and the set of nodes downstream of $a$ is updated to include $b$ and all nodes downstream of $b.$ Finally, nodes upstream of $b$ with unused supply and nodes downstream of $a$ with unmet demand are paired up. The lesser of the unused supply and unmet demand is subtracted from both the supply of the upstream node and the demand of the downstream node, and whichever has zero supply/demand left is removed from the set of nodes with unused supply/unmet demand. It is possible (and certain when the very last drop of supply/demand is accounted for) that both nodes will be removed from their respective sets.
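
Continuing with the hypothetical NodeInfo class above, processing a single arc $(a, b)$ from the shuffled list might look roughly like this. It is a simplified sketch: in the full bookkeeping, the upstream/downstream updates would also be propagated to the other affected nodes.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Add arc (a, b), update the upstream/downstream sets, and pair unused supply
// upstream of b with unmet demand downstream of a. "unusedSupply" and
// "unmetDemand" hold the ids of nodes with supply or demand still unaccounted for.
static void addArc(NodeInfo a, NodeInfo b, Map<Integer, NodeInfo> nodes,
                   Set<Integer> unusedSupply, Set<Integer> unmetDemand) {
  int[] arc = {a.id, b.id};
  a.arcsOut.add(arc);
  b.arcsIn.add(arc);
  // Nodes now upstream of b: a plus everything upstream of a (and symmetrically below).
  Set<Integer> newUpstream = new HashSet<>(a.upstream);
  newUpstream.add(a.id);
  Set<Integer> newDownstream = new HashSet<>(b.downstream);
  newDownstream.add(b.id);
  b.upstream.addAll(newUpstream);
  a.downstream.addAll(newDownstream);
  // Pair up unused supply with unmet demand along the newly created paths.
  for (int s : newUpstream) {
    if (!unusedSupply.contains(s)) continue;
    for (int d : newDownstream) {
      if (!unmetDemand.contains(d)) continue;
      NodeInfo sup = nodes.get(s), dem = nodes.get(d);
      int moved = Math.min(sup.remaining, -dem.remaining);   // dem.remaining < 0
      sup.remaining -= moved;
      dem.remaining += moved;
      if (dem.remaining == 0) unmetDemand.remove(d);
      if (sup.remaining == 0) { unusedSupply.remove(s); break; }
    }
  }
}
```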

Once there are no nodes left with unused supply or unmet demand, we can be certain the digraph contains at least one feasible solution. All that remains is to randomly add arcs until the user's specified minimum number of arcs is reached.



Friday, January 17, 2025

Mint Madness

I'm a long-time, and generally quite content, user of the Linux Mint operating system. For quite a while now, it has had a very nice Software Manager program that lets you install or uninstall programs, as well as an older program named Synaptic that provides more fine-grained capabilities. Synaptic has not been updated in almost two years. To a lot of computer nerds, that means it needs to be replaced. To me, it's a reminder of the first sentence of the "Red-neck Repair Manual": "If it ain't broke, don't fix it."

Today I upgraded Mint from version 22 (Wilma) to version 22.1 (Xia). To my surprise, the upgrade uninstalled Synaptic and did not reinstall it. There was no warning in the Mint 22.1 release notes that this was going to happen (?!). Fortunately, it remains available in a repository and I was able to reinstall it (somewhat ironically, via Software Manager, its ostensible replacement).

I looked at the Mint message boards, and there was a fair bit of confusion and concern about this, along with some relief when users found out they could reinstall it. Looking for a rationale for the removal, all I could find were some vague assertions about there being better alternatives (Software Manager?) than the aged Synaptic, and speculation that the Mint developers were perhaps trying to push users to Software Manager or something new.

Here's the problem with this: Software Manager is not a replacement for Synaptic. It's fine for installing programs, but as best I can tell it is useless for installing libraries. For example, suppose that I want to install a program or library that will mess around with PDFs, and the installer balks because it cannot find the libpoppler library (or cannot find the correct version of it). With Synaptic, I can search "libpoppler" and see which versions I have installed and which are out there but not installed. Odds are a third-party program or library looking for it wants libpoppler-dev, and if I don't already have it, it's a couple of clicks to install it with Synaptic. If I search Software Manager, it won't find what I need. (As of today, at least, it just suggests Ruby-poppler, which it says contains Ruby bindings for libpoppler, as opposed to libpoppler itself.) There's a switch in Software Manager labeled "Search in package descriptions (even slower search)". I tried that and I can confirm that "even slower" was a massive understatement. "Glacial" might be more accurate. The extra time was for naught; it found an R library that uses libpoppler and one other similarly irrelevant hit, but nothing related to installing libpoppler.

I am at a total loss as to why the Mint folks would take away a convenient way to install libraries (not included in programs). There are other ways to install libraries (including directly from the command line), but I am not aware of any as easy to use as Synaptic ... which I fortunately still have.