Tuesday, April 16, 2019

CPLEX Callbacks: ThreadLocal v. Clone

A while back, I wrote a post about the new (at the time) "generic" callbacks in CPLEX, including a brief discussion of adventures with multiple threads. A key element was that, with generic callbacks, IBM was making the user more responsible for thread safety. In that previous post, I explored a few options for doing so, including use of Java's ThreadLocal class. ThreadLocal is my current first choice when writing a generic callback.

Recently, I saw a reply by IBM's Daniel Junglas on a CPLEX user forum that contained the following information.
For each thread CPLEX creates a clone of your callback object. This is done by invoking the callback's clone() method. The Java default implementation of clone() returns a shallow copy of the class.
That set me to wondering about the differences between ThreadLocal and cloning, so I did a bit of experimentation. I'll summarize what I think I learned below.

Why is any of this necessary?


It's not if you only allow CPLEX to use a single thread. When using callbacks with multiple threads, it is possible that two or more threads will access the callback simultaneously. Simultaneously reading/fetching the value of something is harmless, but trying to modify a value simultaneously will either trigger an exception or cause the JVM to crash. (That's not hypothetical, by the way. I have a test program that routinely crashes the JVM.) "Information callbacks" are harmless, but "control callbacks" (where you attempt to alter what CPLEX is doing, by adding cuts or lazy constraints or by taking control of branching), usually involve modifying something. In particular, with Benders decomposition your callback needs to solve a subproblem after first adjusting it based on the proposed master problem solution. Two threads trying to adjust the subproblem at the same time spells disaster.

Is the use of ThreadLocal an option for both legacy and generic callbacks?


Yes. The only tricky part is in initialization of the storage. Let's say that I'm doing Benders decomposition, and I have a subproblem that is an instance of IloCplex. I am going to need to create a separate version of the subproblem for each thread. So my callback class will contain a class field declared something like
private ThreadLocal<IloCplex> subproblem;
and it will need to fill in a value of the subproblem for each thread.

With a generic callback, the ThreadUp context provided by CPLEX can be used to do this. Assuming that context is the argument to the callback function you write, you can use code like the following to initialize the subproblem for each thread.
if (context == IloCplex.Callback.Context.Id.ThreadUp) {
  IloCplex s = ...; // code to generate a new subproblem
  subproblem.set(s);
}
Once the subproblem is initialized, to use it when a candidate solution is being proposed, you need to extract it from the ThreadLocal field. Here is an example of how that would look.
if (context == IloCplex.Callback.Context.Id.Candidate) {
  IloCplex s = subproblem.get(s);
  // Do Benders stuff with s.
}

Legacy callbacks do not have a mechanism like ThreadUp for detecting the creation of a new thread. You can still initialize the subproblem "lazily". In the legacy callback, before using the subproblem check to see if it exists. If not, create it. Here's some sample code.
IloCplex s = subproblem.get();  // get the subproblem
if (s == null) {
  // First use: need to generate a fresh subproblem.
  s = ...; // code to generate a new subproblem
  subproblem.set(s);
}
// Do Benders stuff with s.

Is cloning an option for both legacy and generic callbacks?


No. I don't think cloning can be used with generic callbacks. In Java, objects can be cloned. CPLEX declares legacy callbacks as Java objects, but it declares generic callbacks by means of an interface. Cloning an interface is not really a "thing" in Java.

When you use a generic callback, you create a class that implements the Callback interface. It's certainly possible to make that class cloneable, but according to my experiments CPLEX will not call the clone() method on that class, even if it exists. So the solver will be working with a single instance of that class, and if the class is not thread-safe, kaboom.

What does cloning entail?


There are quite a few web sites that explain cloning in Java, and if you intend to use it I recommend you do a search. I'll just cover the bare bones here.
  1. Declare your class with the modifier "implements Cloneable".
  2. Override the protected method Object.clone.
  3. Call the parent method (super.clone()) as the very first executable line of the clone() method.
  4. Handle the CloneNotSupportedException that the parent method might hypothetically throw, either by catching it and doing something, or by rethrowing it.
  5. After calling the parent method, fix anything that needs fixing.
I left that last step rather vague. With a Benders lazy constraint callback, you will need to replace the original subproblem with a freshly generated version (so that no two threads are chewing on the same subproblem instance). If life were fair (it's not), you would just do a deep clone of the original problem. The catch is that the IloCplex class is not cloneable. So the best (only?) alternative I can see is to build a new copy of the subproblem, the same way you built the original copy.

If you have other class fields that the callback might modify (such as a vector that stores the best feasible solution encountered), you may need to do a deep clone of them (or replace them with new versions). A detailed analysis of the differences between shallow and deep cloning is beyond the scope of this post (or my competence). As Daniel points out in his answer, super.clone() only makes a shallow copy. You'll need to take additional steps to make sure that fields containing arrays and other objects (other than primitives) are handled properly if the callback might modify them.

Here is some skeleton code.

public class MyCallback extends IloCplex.LazyConstraintCallback
                        implements Cloneable {
  private IloCplex subproblem;

  // ...

  /**
   * Clone this callback.
   * @return the clone
   * @throws CloneNotSupportedException if the parent clone method bombs
   */
  @Override
  public MyCallback clone() throws CloneNotSupportedException {
    // Call Object.clone first.
    MyCallback cb = (CloneCallback) super.clone();
    // Replace the subproblem with a fresh copy.
    subproblem = ...;  // generate a fresh copy of the subproblem
    // Make deep copies (or new values) for other fields as needed.
    return cb;
  }

}