Sunday, July 30, 2023

Visiting All Nodes

A question on Operations Research Stack Exchange asks about finding a shortest route through a connected graph that visits every node. This is distinct from the well-known traveling salesman problem (TSP), in that the TSP requires each node to be visited exactly once whereas the OR SE question explicitly allows nodes to be visited more than once.

The question reminded me of a conversation I had years ago with a doctoral student in our logistics program. He was working on routing deliveries (via truck) from a depot to various customers, and was trying to formulate it as a TSP using a graph based on a road map (buildings or intersections as nodes, roads as edges). I did my best at the time to convince him that the TSP was unnecessarily restrictive and might even make the problem infeasible. The second example in the OR SE question is a nice example of how that could happen. Generally speaking, in an actual routing problem there is nothing stopping you from passing by/through a previously visited location if that's the best route. There are a few exceptions to this, some in military applications (you were laying mines and/or blowing up bridges as you passed them) and some in normal life (you were speeding and might not want to return some place where a traffic cop would remember you), but for the most part I believe it holds.

My advice to the doctoral student was similar to the answer Rob Pratt posted on OR SE. Rethink the graph, keeping just the depot and customers as nodes and letting the edges in the new graph represent the shortest routes between pairs of nodes. So edge $(i,j)$ and edge $(k,\ell)$ connect different pairs of nodes but might represent physical routes that cross (you pass through the same intersection on both, albeit possibly in different directions) or overlap (both take US 127 north from mile marker 102 to mile marker 137). Building the edges in the new graph involves solving a shortest path problem between each pair of nodes (using the original graph). You can apply Dijkstra's algorithm to each pair of nodes, but I'm pretty sure applying the Floyd-Warshall algorithm would be faster. Pro tip: if you use Floyd-Warshall, don't bother to extract the route for each pair of nodes. Just get the distances, and keep the F-W solution handy. Once you have solved a TSP on the modified graph and know which edges will be used, you can extract the physical routes corresponding to just the used edges from the F-W solution.

I posted an explicit MIP model (which I will not repeat here, since it's a space hog) as an answer to the OR SE question. I suspect that Rob's approach (TSP on a modified graph) might be faster, particularly because it allows you to use a dedicated TSP solver. Although TSPs have a reputation for being difficult (being the poster child for NP-hard problems), bespoke solvers like Concorde can chew through rather large TSPs pretty quickly. Other than the upfront computational cost of finding the shortest path between every pair of nodes, the one concern I might have with the TSP approach is that it converts what might have been a sparse graph to a complete graph. So, like all things MIP, which approach is better is an empirical question.