A question posted on Stack Overflow (and subsequently deleted) led to a blog post by Erwin Kalvelagen on how to find all paths between two nodes in a directed graph (possibly with self-loops, i.e. arcs from a node to itself) subject to two constraints: no arc can be used more than once in a path; and there is an upper limit $M$ on the number of arcs used. Note that a path might visit a *node* more than once. It just cannot repeat an arc. The original question seems to have referred to an undirected graph, but Erwin's post works with directed graphs and so will I.
Erwin explored some mixed integer linear programming (MIP) models in his post, and a subsequent post on OR Stack Exchange led to more proposals of MIP models (including one from me). I also suggested that a "brute force" approach might be faster than any of the MIP models. In what follows, I will spell out both my MIP model and the brute force approach I used. Java code for both (which requires CPLEX and the Apache Commons Collections library) are in my code repository.
In what follows $A$ is the set of arcs in the graph, $s$ is the origin node for all paths, $t$ is the destination node for all paths, and $M$ is the maximum number of arcs to include in a path.
MIP model
- $u_{a}\in\left\{ 0,1\right\} $ is 1 if and only if arc $a$ is used on the path.
- $f_{a}\in\left\{ 0,1\right\} $ is 1 if and only if arc $a$ is the first arc on the path.
- $\ell_{a}\in\left\{ 0,1\right\} $ is 1 if and only if arc $a$ is the last arc on the path.
- $y_{ab}\in\left\{ 0,1\right\} $ is 1 if and only if arc $b$ immediately follows arc $a$ on the path.
- $z_{a}\in\left[0,M\right]$ will be the number of arcs preceding arc $a$ on the path (0 if $a$ is not on the path).
Some of the variables can be eliminated (fixed at 0) at the outset.
- $f_{a}=0$ if node $s$ is not the tail of arc $a.$
- $\ell_{a}=0$ if node $t$ is not the head of arc $a.$
- $y_{ab}=0$ if the head of arc $a$ is not the tail of arc $b.$
Since we are just interested in finding feasible solutions, the objective is to minimize 0.
Use of the $z$ variables mimics the Miller-Tucker-Zemlin approach to avoiding subtours in the traveling salesman problem. A common knock on the MTZ formulation for the TSP is that it has a somewhat loose continuous relaxation. Since we have a trivial objective (all feasible solutions are optimal), that is not a concern here.
The constraints are as follows.
- There must be one first arc and one last arc.
$$\sum_{a\in A}f_{a}=1.$$$$\sum_{a\in A}\ell_{a}=1.$$ - At most $M$ arcs can be used.$$\sum_{a\in A}u_{a}\le M.$$
- An arc is used if and only if it is either the first arc or follows another arc on the path.$$f_{a}+\sum_{b\in A}y_{ba}=u_{a}\quad\forall a\in A.$$
- The last arc must be a used arc.$$\ell_{a}\le u_{a}\quad\forall a\in A. (1)$$
- The sequence value of an unused arc is 0.$$z_{a}\le Mu_{a}\quad\forall a\in A.$$
- No arc can follow the last arc.$$\ell_{a}+\sum_{b\in A}y_{ab}\le1\quad\forall a\in A. (2)$$
- If an arc is used, either it is the last arc or another arc follows it.$$\ell_{a}+\sum_{b\in A}y_{ab}=u_{a}\quad\forall a\in A. (3)$$
- If an arc $b$ follows arc $a$, the sequence number of arc $b$ must be one higher than the sequence number of arc $a.$ $$z_{a}-z_{b}+\left(M+1\right)y_{ab}\le M\quad\forall a,b\in A,a\neq b.$$
Over on OR Stack Exchange, Rob Pratt correctly pointed out that constraint (3) implies constraints (1) and (2). I've left them in the Java code anyway. The CPLEX presolver removes them automatically.
Finding all solutions
To find all solutions, one possible approach is to solve whatever MIP model you choose, then add a "no-good" constraint that eliminates the solution just found (and only that one) and solve again, until eventually the aggregation of "no-good" constraints makes the model infeasible. What I did instead was to use the "populate" command in CPLEX, which accumulates a pool of solutions. Besides a time limit, I used two non-default parameter settings: I cranked up the solution pool capacity (the maximum number of solutions to find) to something high enough to let it find all solutions; and I set the solution pool intensity to its highest value (4), which tells CPLEX to aggressively seek out all feasible solutions.
No comments:
Post a Comment
Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.