## Saturday, August 25, 2018

### Adding Items to a Sequence

A question posed on OR-Exchange in 2017 asked the following: Given a tour of nodes, how does one best add two new nodes while respecting the ordering of the original tour. Specifically, the author began with a tour 0 - 1 - 2 - 4 - 6 - 0 (where node 0 is a depot) and wanted to add new stops 3 and 5 in such away that, in the revised tour, stop 1 still came before stop 2, stop 2 before stop 4, etc.

This problem can arise not just in vehicle routing but in many sorts of sequencing problems (such as scheduling jobs for production). Of course, preserving the original ordering to the extent possible is not always a concern, but it might be if, for instance, the existing stops are customers who have been promised somewhat general time windows for delivery. In any event, we'll just take the question as a given.

The answer I posted on OR-X made the somewhat charitable (and, in hindsight, unwarranted) assumption that the two new stops would be inserted by breaking two previous arcs, rather than consecutively (for instance, ... - 2 - 3 - 5 - 4 - ...). So I'll post an answer without that assumption here. In fact, I'll post three variants, one specific to the case of adding exactly two stops and the other two more general.

First, let me articulate some common elements. I'll denote the set of original nodes by $N_1$, the set of nodes to be added by $N_2$, and their union by $N=N_1 \cup N_2$. All three approaches will involve setting up integer programming models that will look for the most part like familiar routing models. So we will have binary variables $x_{ij}$ that will take the value 1 if $j$ immediately follows $i$ in the new tour. We will have constraints ensuring that every node is entered and exited exactly once:$$\sum_{j\in N} x_{ij} = 1\quad \forall i\in N\\ \sum_{i \in N} x_{ij} = 1 \quad\forall j\in N.$$The objective function will be some linear combination of the variables (sum of distances covered, sum of travel times, ...), which I will not worry about here, since it is no different from any sequencing model.

The first new wrinkle is that we do not define a variable for every pair of nodes. We create $x_{ij}$ only for the following combinations of subscripts: \begin{align*} i & \in N_{2},j\in N_{2},i\neq j\\ i & \in N_{1},j\in N_{2}\\ i & \in N_{2},j\in N_{1}\\ i & \in N_{1},j\in N_{1},(i,j)\in T \end{align*} where $T$ is the original tour. Thus, for example, we would have $x_{24}$ but not $x_{42}$, nor $x_{26}$. The rationale is straightforward: if we add an arc between two original nodes that were not successors on the original tour, we will force an order reversal. For instance, suppose we replace the arc 2 - 4 with, say, 2 - 6. Node 4 now must appear either before node 2 or after node 6, and either way the order has not been preserved.

### Version 1

The first variant makes explicit use of the fact that we have only two new nodes. We add one subtour elimination constraint, to prevent the new nodes from forming a subtour: $x_{35}+x_{53}\le 1.$ Now consider how many different ways we could insert the two new nodes. First, we could break two links in the original tour, inserting 3 in the void where the first link was and 5 in the void where the second link was. Since the original tour had five links there are $\binom{5}{2}=10$ distinct ways to do this. Similarly, we could break two links but insert 5 first and 3 later. There are again ten ways to do it. Finally, we could break one link and insert either 3 - 5 or 5 - 3 into the void. With five choices of the link to break and two possible orders, we get another ten results, for a grand total of 30 possibly new tours.

With that in mind, consider what happens if node 3 is inserted after original node $i$, breaking the link between $i$ and its original successor $j$. (In our model, this corresponds to $x_{i3}=1$.) If this is a single node insertion, then we should have $j$ follow node 3 ($x_{3j}=1$). If it is a double insertion ($i$ - 3 - 5 - $j$), we should have $x_{35}=x_{5j}=1$. We can capture that logic with a pair of constraints for each original arc:
\left.\begin{aligned}x_{i3}-x_{3j} & \le x_{35}\\ x_{i3}-x_{3j} & \le x_{5j} \end{aligned} \right\} \forall(i,j)\in T. We could do the same using node 5 in place of node 3, but it is unnecessary. If node 3 is correctly inserted by itself, say between $i$ and $j$, and node 5 is inserted after original node $h$, then the original successor $k$ of $h$ needs a new predecessor. That predecessor cannot be $h$, nor can it be any other original node (given our reduced set of variables), nor can it be node 3 (which now precedes $j$). The only available predecessor is 5, giving us $h$ - 5 - $k$ as expected.

You might wonder how this accommodates a 5 - 3 insertion, say after node $i$. The original successor $j$ of $i$ needs a new predecessor, and 3 is the only eligible choice, so we're good.

I tested this with a small Java program, and it did in fact find all 30 valid revised tours (and no invalid ones).

### Version 2

Version 2, which can be applied to scenarios with any number of new nodes, involves building a standard sequencing model with subtour elimination constraints. The only novel element is the reduced set of variables (as described above). A blog is no place to explain sequencing models in their full glory, so I'll just assume that you, the poor suffering reader, already know how.

### Version 3

In version 3, we again build a sequencing model with the reduced set of variables, but this time we use the Miller-Tucker-Zemlin method of eliminating subtours rather than adding a gaggle of subtour elimination constraints. The MTZ approach generally results in smaller models (since the number of subtours, and hence the potential number of subtour constraints, grows combinatorially with the number of nodes), but also generally produces weaker relaxations.

The Wikipedia page for the TSP shows the MTZ constraints, although for some reason without labeling them as such. Assume a total of $n$ nodes (with consecutive indices), with node $0$ being the depot. The MTZ approach adds continuous variables $u_i, \,i\in \{1,\dots,n\}$ with bounds $0\le u_i \le n-1$. It also adds the following constraints for all eligible arcs $(i,j)$ with $i\neq 0$:$$u_i - u_j + n x_{ij} \le n-1.$$You can think of the $u_i$ variables as counters. The MTZ constraints say that if we go from any node $i$ (other than the depot) to any node $j$ (including the depot), the count at node $j$ has to be at least one larger than the count at node $i$. These constraints preclude any subtours, since a subtour (one starting and ending any place other than the depot) would result in the count at the first node of the subtour being larger than itself.

As I mentioned, the MTZ formulation has a somewhat weaker LP relaxation than a formulation with explicit subtour elimination constraints, so it is not favored by everyone. In our particular circumstance, however, it has an additional virtue: it gives us a relatively painless way to enforce the order preservation requirement. All we need do is insert constraints of the form$$u_j \ge u_i + 1\quad\forall (i,j)\in T.$$This forces the counts at the original nodes to increase monotonically with the original tour order, without directly impacting the counts at the new nodes.