Saturday, July 20, 2019

Using Java Collections with CPLEX

Disclaimer: What follows is specific to Java, but with some name changes will also apply to C++. If you are using one of the other programming APIs for CPLEX, something analogous may exist, but I would have no idea about it.

I've seen a few questions by CPLEX users on forums recently that suggest the questioners may be moving from using a modeling language for optimization to using a general purpose language. Modeling languages tend to be more expressive (at least in my opinion) when it comes to writing optimization models. That's hardly shocking given that they are built for that purpose (and general programming languages are not). Recent questions have been along the lines of the following: I was using a structure (a tuple, or something similar) supported by the modeling language, and I don't know how to translate that to the general language; or I was creating arrays of variables / constraints / whatever indexed by something human-readable (probably names of things) and I don't know how to switch to integer-indexed arrays and keep it readable.

In my mind, the answer to those issues in Java is to make good use of classes and the Java Collection interface (and its various implementations). I'll give some examples in the context of a network flow problem.

In a modeling language, I might use a tuple to represent an arc. The tuple would consist of some identifier (string? index?) for the tail node of the arc, some identifier for the head node of the arc, and maybe an arc weight or some other arc property. In Java, I would create a class named Node to represent a single node, with fields including the node's name or other unique identifier and any costs, weights or other parameters associated with the node. Then I would create a class named Arc with fields of type Node holding the tail and head nodes, along with fields for any parameters associated with the arc (such as cost), and maybe a string field with a human-friendly label for the arc.

Now suppose I have an instance of IloCplex named cplex, and that I need a variable for each arc representing the flow on the arc. The conventional (?) approach would be to create a one dimensional array of IloNumVar instances, one per arc, and park the variables there. In fact, CPLEX has convenience methods for creating vectors of variables. The catch is that it may be hard to remember which index corresponds to which arc. One partial answer is to attach a label to each variable. The various methods in the CPLEX API for creating variables (and constraints) all have versions with a final string argument assigning a name to that variable or constraint. This works fine when you print out the model, but it's not terribly helpful while you're doing the coding.

I pretty much always use that feature to assign labels to variables and constraints, but I also employ collections in various ways. In my hypothetical network flow example, I would do something like the following.

HashSet<Node> nodes = new HashSet<>();
// Fill nodes with all node instances.
HashSet<Arc> arcs = new HashSet<>();
// Fill arcs with all arc instances.
HashMap<Arc, IloNumVar> flowVars = new HashMap<>();
HashMap<IloNumVar, Arc> flowArcs = new HashMap<>();
for (Arc a : arcs) {
  IloNumVar x = cplex.numVar(0, Double.MAX_VALUE, "Flow_" + a.getID());
  flowVars.put(a, x);
  flowArcs.put(x, a);

I put the nodes and arcs in separate collections (I used sets here, but you could equally well use lists), and then I create mappings between model constructs (in this case, arcs) and CPLEX constructs (in this case, variables). Collections can be iterated over, so there is no need to mess around with going back and forth between model constructs and integer indices. For each arc, I create a variable (and give it a name that includes the string identifier of the arc -- I'm assuming here that I gave the Arc class a getID method). I then map the arc to the variable and the variable to the arc. Why two maps? The arc to variable map lets me grab the correct variable while I'm building the model. For instance, a flow balance constraint would involve the flow variables for all arcs leading into and out of a node. I would identify all those arcs, park them in a collection (list or set), iterate over that collection of arcs, and for use the flowVars map to get the correct variables.

The reverse mapping I set up comes into play when the solver has a solution I want to inspect. I can fetch the value of each variable and map it to the corresponding variable, along the following lines.

HashMap<IloNumVar, Double> solution = new HashMap<>();
for (IloNumVar x : flowArcs.keySet()) {
  solution.put(x, cplex.getValue(x));

When I need to associate variable values with arcs, I can iterate over solution.entrySet(). For each entry e, flowArcs.get(e.getKey()) gives me the arc and e.getValue() gives me the flow on the arc.

Similar things can be done with constraints (IloRange instances).