Friday, January 18, 2013

Symmetry and The Quest for All Solutions

This is an addendum to Wednesday's post on finding all solutions to linear and (mixed) integer linear programs.

In the context of integer programming models, symmetry refers to situations in which some nontrivial permutation of the indices of variables (and perhaps constraints)  yields the original problem again. The implication is that, given a feasible solution to the problem, applying an appropriate permutation to the variables results in another, nominally distinct, feasible solution with identical objective value. I don't intend to go into symmetry in depth here, but in general terms the implication is that, left untreated, symmetry can slow the progress of branch-and-bound or branch-and-cut algorithms. The reason is that when incumbent solutions are found, the existence of their symmetric counterparts in other portions of the tree (with identical objective value) inhibits pruning of nodes based on bounds.

[Edit: I've added a fourth approach. Hat tip to Ed Klotz for reminding me of it in a comment.]

There are three four fundamental approaches to dealing with symmetry:
  • eliminate it in the model by adding constraints;
  • modify the algorithm to take it into account;
  • do nothing and live with the slower convergence of the algorithm; or
  • find an alternative model with no (or less) symmetry.
The purpose of this post is to point out that, when looking for all or even several optimal solutions to integer problems, the last option is a poor choice.

A quick example


Packing and covering problems are notorious for symmetry, so I'll use a packing problem as an example. Suppose that I am moving my household, using a moving company. The moving company provides boxes, which I will charitably assume are identical in size and shape. They also charge for the boxes, so I wish to use the fewest boxes possible. Being an OR person, I create and solve a bin packing model. The optimal solution puts my gym clothes and my paperback books in box #3 and my fine china and my computer in box #7. (I never said it was a good model.)

Now suppose I (carefully) dump out the contents of boxes 3 and 7 and swap them, packing the china and computer in box #3 and gym clothes and paperbacks in box #7. The boxes are identical, so everything fits. I have not changed the number of boxes being used (the objective function), so this second solution is again optimal.

But is it really a "second solution"? The numbering of the boxes means nothing to me. When I get to my destination, the one making all the clattering sounds will have the computer and what's left of the china, and the one smelling a bit ripe will have the paperback books and the gym clothes (did I mention I forgot to wash them?), regardless of the numbers printed on them.

The reformulation option

 

As Ed Klotz notes in his comment, sometimes a different formulation eliminates the symmetry, possibly (but not necessarily) at the expense of adding variables. For example, I recently was playing with a vehicle routing problem (VRP) in which the vehicles are all identical. The VRP requires assigning customers to vehicles and then organizing each vehicles route (essential a traveling salesman problem for each vehicle). A straight-up formulation might look like the following (in which $d_{ij}$ is the distance or travel cost between customers $i$ and $j$, with $i=0$ for the depot, $c_i$ is the load required by customer $i\in\{1,\dots,N\}$, and $C$ is the capacity of each of the $V$ vehicles:
\[ \begin{array}[t]{lrclr} \mathrm{minimize} & & \sum_{v=1}^{V} \sum_{i=0}^{N}\sum_{j=0}^{N} d_{ij}x_{ijv}\\ \mathrm{s.t.} & \sum_{v=1}^{N}y_{iv} & = & 1 & \forall i>0\\ & \sum_{i=0}^{N}x_{ijv} & = & y_{jv} & \forall j>0,\forall v\\ & \sum_{i=0}^{N}x_{jiv} & = & y_{jv} & \forall j>0,\forall v\\ & \sum_{j=1}^{N}c_{j}y_{jv} & \le & C & \forall v\\ & x_{ijv},y_{iv} & \in & \left\{ 0,1\right\} & \forall i,j>0,\forall v\\ & ? & ? & ? \end{array} \]Binary variables $x_{ijv}$ signal whether stop $j$ (a customer or, if $j=0$, the depot) immediately follows stop $i$ on the route for vehicle $v$, while the binary variables $y_{iv}$ signal whether customer $i$ is assigned to vehicle $v$. The first constraint requires that every customer be assigned to exactly one vehicle. The second and third say that, on any given route, a particular customer is entered and exited exactly once if on that route and zero times if not. The fourth constraint asserts the capacity limit of the vehicles. The final set of constraints (which I deliberately omitted -- the question marks) would be subtour elimination constraints, which can be handled in a variety of ways.

The fact that all vehicles are identical introduces one level of symmetry: if I take a feasible (optimal) solution and rotate the routes among the vehicles, I get an equivalent feasible (optimal) solution. Another source of symmetry is that, within any single route, if I traverse the route in reverse order I get another solution with an identical objective value.

The indistinguishable nature of the vehicles, though, allows us to rethink the formulation in a way that both eliminates the first level of symmetry and also reduces the size of the model. Suppose that, instead of having $V$ vehicles, I had one vehicle that would be making $V$ runs, returning to the "barn" each time. The total number of stops in a day would be $N+V+1$, consisting of starting the day at the depot (1), stopping at each customer ($N$), and returning to the depot $V$ times. So we can think of the entire day's schedule as a single string of integers $0,\dots,N$ with $V+1$ zeros and one instance of each other integer.

With that picture in mind, let $x_{ij}$ be 1 if stop $j$ immediately follows stop $i$ in the master schedule for the day, and let $u_i$ be the load of the vehicle when arriving at stop $i\ge 0$. Assume that $c_0=0$ (the depot has no demand). We arrive at the following alternative model:
\[ \begin{array}[t]{lrclr} \mathrm{minimize} & & \sum_{i=0}^{N}\sum_{j=0}^{N} & d_{ij}x_{ij}\\ \mathrm{s.t.} & \sum_{i=0}^{N}x_{ij} & = & \begin{cases} 1 & j>0\\ V & j=0 \end{cases} & \forall j\ge0\\ & \sum_{i=0}^{N}x_{ji} & = & \begin{cases} 1 & j>0\\ V & j=0 \end{cases} & \forall j\ge0\\ & u_{j}-u_{i} & \le & C-\left(C+c_{i}\right)x_{ij} & \forall i>0,\forall j\ge0\\ & x_{ij} & \in & \left\{ 0,1\right\} & \forall i,j\ge0\\ & u_{j} & \in & [0,C] & \forall j\ge0 \end{array} \] The first two constraints say that every customer is entered and exited once, while the depot is entered and exited exactly once per route. (If the depot occurs in two consecutive stops, it means we have more vehicles than we need.) The third constraint says that the load in the vehicle when entering stop $j$ is no greater than the load entering the previous stop minus the demand at that stop, with the important exception that at the first stop after a trip to the depot, the limit is not enforced (so the maximum load entering the first post-depot stop is $C$, the vehicle capacity).

The model still contains the second source of symmetry -- if we reverse the stops between any two consecutive visits to the depot, the travel distance is unchanged -- but we have eliminated the first source of symmetry. Once we have a solution, we can use a random number generator (or take bribes from the drivers) to assign routes to vehicles.

The reason to find "all" solutions


As I mentioned in the previous post, possible motivations for finding all (or at least multiple) optimal solutions include providing decision makers with choices and dealing with secondary criteria omitted from the model. In either of those cases, I'm fairly certain that providing solutions that are permutations of the original solutions will be unsatisfactory. To quote Spock (in the novel "Spock Must Die!"), "a difference which makes no difference is no difference". So in those cases it behooves you to avoid solutions that are permutations of other solutions. If you take the third approach ("do nothing"), not only will solution time increase, but you will need to screen solutions to detect which are permutations.

If permutations will do ...


On the other hand, if permuted solutions satisfy your lust for multiple solutions, I would still advocate either constraining the symmetry away or modifying the algorithm to avoid (or exploit) it where possible. Doing either of those requires that you recognize the symmetry (which is often something as simple as permuting labels of things). Once you find a proven optimal solution, it is quick and easy to manually apply the permutation(s) to find other "distinct" optima.

12 comments:

  1. I have seen some colleagues swtiching from mixed-integer programming to constraint programming in order to avoid directly dealing with symmetry while working on scheduling problems. I have never had to deal with symmetry directly, but from what I heard it really seems a challenging issue to address in the context of general mixed-integer programming.

    ReplyDelete
    Replies
    1. Marc-Andre: I don't that much (yet) about CP, but I'm pretty sure they have to deal with symmetry in CP too - although it may be simpler to constrain it away in a CP model. I know that I've seen references to symmetry on web sites/blogs devoted to CP. On the MIP side, there's some interesting work being done with "orbital pruning" and "orbital branching", which is what I meant by "modify the algorithm" above.

      Sometimes a few easy constraints added to a MIP model can make a big difference. Years ago I encountered a PhD student doing a model assigning MBA students to teams (something I've worked on myself). She complained about slow solution times. I suggested a single set of constraints that effectively forced team assignments to be made in lexicographically ascending order, and it cut her run time by a factor of 10.

      Delete
  2. Interesting question. My guess is sometimes it helps a lot, sometimes a little, sometimes not at all. The last is not really a guess. I took an example with high symmetry that I've used elsewhere in the blog -- finding a bunch of binary vectors of specified dimension so as to maximize the minimum Hamming distance between them -- and ran it eight ways, asking for up to 100 optimal solutions. The first way uses a bunch of explicit antisymmetry constraints that I added, plus the default setting for SYMMETRY. That found 40 solutions, of which four were optimal (the entire number of optima after adjusting for symmetry) in "0" seconds. The other methods omitted my constraints and used the seven possible settings for SYMMETRY. They all behaved identically, returning 100 optimal solutions (many of which were permutations of each other) in 7 seconds.

    One example does not prove much, other than that the most aggressive SYMMETRY setting is not necessarily a replacement for explicit symmetry-breaking constraints.

    ReplyDelete
  3. Very interesting post, I missed it somehow.
    Symmetry is also a hot topic in constraint programming, a simple search should show papers my various people (including me.)

    To my experience, carefully designed symmetry breaking constraints are the most effective way to improve running time. However, this requires a good understanding of the model where symmetry occurs.

    When it comes to improve a general algorithm like CPLEX MIP, adding constraints doesn't seem to be the best way. Indeed, the additional constraints slow down the LP relaxation solves at each node. Moreover, they may go against the order in which the algorithm searches for solutions.

    The trick is probably to add constraints in very specific cases.

    ReplyDelete
    Replies
    1. Thanks for the comment. When I first tried an online search for "symmetry" in a math programming context, I found far more hits relating to CP than to MP. My suspicion, which your comment supports, is that it is more widely recognized (and studied?) in the CP arena than in MP, although a few people are actively working on exploitation of symmetry in MIP algorithms.

      The types of symmetry-breaking constraints I usually add (such as defending against permutations of "bin" labels in models that resemble assignment or packing models) live in a somewhat gray area between generic and problem-specific, and they do indeed add drag to the node LP solutions. I'd like to find time to try out some of the algorithmic tweaks (orbital pruning or branching), but they come with two caveats. The first is that they're of no help to people who rely on a modeling language/system with a solver interface. Those people will probably have to settle for taking their chances that extra constraints pay for themselves. The second is that I think I would need to take control of the branching process to use orbital this or that, and generally speaking CPLEX seems to be a lot smarter about branching than I am.

      Delete
  4. Regarding the 3 fundamental approaches listed to address symmetry, a fourth approach might qualify as fundamental too. Namely, instead of modifying the existing model with symmetry breaking constraints, reformulate it completely with a alternate formulation that doesn't have the symmetry. This shows up in cutting stock problems as well as shipping problems like the one discussed here where groups of boxes and vehicles may have identical capacities. However, the reformulations, while removing the symmetry, often have many more variables, potentially requiring decompositions like column generation. So, try the first 3 approaches Paul listed first. But, if none of them help, keep this other one in mind.
    Here's a link for more info: http://www-01.ibm.com/support/docview.wss?uid=swg21400016

    ReplyDelete
    Replies
    1. Ed: Thanks for the insightful comment (and link). I'm going to edit the post to include the fourth approach.

      Delete
  5. Hey Paul, thanks for this article. How could you apply these techniques on an enterprise level? I appreciate the moving example, could you provide an example based on a Fortune 500 company? Thanks!

    -James M.

    ReplyDelete
    Replies
    1. James: AFAIK, packing containers (where you have varying sizes of containers, but two containers of the same size are basically indistinguishable) happens in Fortune 500 firms (shipping products).

      Another example is with stock cutting. Suppose you're a paper mill (in a Fortune 500 company :-) ). You produce large rolls of a given type of paper, possibly all of one width or possibly several different widths. Those rolls are slit into smaller rolls that are sold to customers. I think carpet mills behave similarly, but I'm not positive. Given a master roll of a certain width (call it 10 feet for the sake of argument), you can find a bunch of patterns that will slit it into multiple rolls of various retail widths (10", 20", ...) plus likely some trim waste. Choosing the right mix of master roll sizes and patterns to meet your demand with minimal waste can be a nontrivial problem. Symmetry comes in when you have patterns that are basically permutations of each other: 2' - 5' - 3' widths from a 10' roll v. 5' - 3' - 2', etc. That's easily handled (assuming you don't bump into some equipment specific issue) by requiring that patterns be arranged in ascending size order (so 2' - 3' - 5' is the single allowable version of the aforementioned pattern).

      I think symmetry may also be an issue in some vehicle scheduling problems (airlines, railroads, trucking firms) where you might have multiple equally good choices for either equipment or crews for a certain route/load/whatever. It would be harder to spot the actual symmetry in an airline scheduling example (and maybe not worth it?) because there are lots of details that void much of the symmetry. For instance, two seemingly identical aircraft (both available at the same point of origin at the same time) may not be interchangeable if one is getting near its allowed travel limit before required maintenance (and similarly for two crews if one crew is nearer to a mandated limit on operating hours or one has to get home sooner than the other).

      Delete
  6. Hi, Paul...what about the input data based approaches in designing the constraints to resolve symmetry e.g. avoiding symmetry in TSP by analyIng cost matrix? Also, do you know any example where they have been used. Thanks.

    ReplyDelete
    Replies
    1. My answers are, in order, "not familiar with it" and "no". To paraphrase Shakespeare, there are more things in optimization than in my philosophy. :-)

      Delete

Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.