A question on OR Stack Exchange asked about solving an assignment problem "heuristically but to optimality". The problem formulation (in which I stick as closely as possible to the notation in the original post, but substitute symbols for two numeric parameters) is as follows:

\begin{align*} \max_{d_{u,c}} & \sum_{u=1}^{U}\sum_{c=1}^{C}\omega_{u,c}d_{u,c}\\ \text{s.t. } & \sum_{c=1}^{C}d_{u,c}\le C_{\max}\ \forall u\in\left\{ 1,\dots,U\right\} \\ & \sum_{c=1}^{C}d_{u,c}\ge1\ \forall u\in\left\{ 1,\dots,U\right\} \\ & \sum_{u=1}^{U}d_{u,c}\le U_{\max}\ \forall c\in\left\{ 1,\dots,C\right\} \\ & d_{u,c}\in\left\{ 0,1\right\} \ \forall u,c. \end{align*} Here $d_{u,c}$ is a binary variable, representing assignment of "user" $u$ to "service provider" $c$, and everything else is a parameter. Each user must be assigned at least one provider and at most $C_\max$ providers, and each provider can be assigned at most $U_\max$ users. The objective maximizes the aggregate utility of the assignments.

One of the answers to the question asserts that the constraint matrix has the "integrality property", meaning that any basic feasible solution of the LP relaxation will have integer variable values. The recommended solution approach is therefore to solve the LP relaxation, and I agree with that recommendation. (I have not seen a proof that the matrix has the integrality property, but in my experiments the LP solution always was integer-valued.) That said, the author did ask about "heuristic" approaches, which got me wondering if there was a way to solve to optimality without solving an LP (and thus requiring access to an LP solver).

I decided to try Lagrangean relaxation, and it seems to work. In theory, it should work: if the constraint matrix has the integrality property, and the LP relaxation automatically produces an optimal integer-valued solution, then there is no duality gap, so the solution to the Lagrangean problem should be optimal for the original problem. The uncertainty lies more in numerical issues stemming from the solving of the Lagrangean problem.

In what follows, I am going to reverse the middle constraint of the original problem (multiplying both sides by -1) so that all constraints are $\le$ and thus all dual multipliers are nonnegative. If we let $\lambda\ge 0$, $\mu\ge 0$ and $\nu\ge 0$ be the duals for the three sets of constraints, the Lagrangean relaxation is formulated as follows:

$$\min_{\lambda,\mu,\nu\ge0}LR(\lambda,\mu,\nu)=\\\max_{d\in\left\{ 0,1\right\} ^{U\times C}}\left(\sum_{u}\sum_{c}\omega_{u,c}d_{u,c}-\sum_{u}\lambda_{u}\left[\sum_{c}d_{u,c}-C_{\max}\right]\\+\sum_{u}\mu_{u}\left[\sum_{c}d_{u,c}-1\right]-\sum_{c}\nu_{c}\left[\sum_{u}d_{u,c}-U_{\max}\right]\right).$$

We can simplify that a bit:

$$\min_{\lambda,\mu,\nu\ge0}LR(\lambda,\mu,\nu)=\\\max_{d\in\left\{ 0,1\right\} ^{U\times C}}\left(\sum_{u}\sum_{c}\left[\omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}\right]d_{u,c}\\+C_{\max}\sum\lambda_{u}-\sum_{u}\mu_{u}+U_{\max}\sum_{c}\nu_{c}\right).$$

The inner maximization problem is solvable by inspection. Let $\rho_{u,c}= \omega_{u,c}-\lambda_{u}+\mu_{u}-\nu_{c}$. If $\rho_{u,c} > 0$, $d_{u,c}=1$. If $\rho_{u,c} < 0$, $d_{u,c}=0$. If $\rho_{u,c} = 0$, it does not matter (as far as the inner problem goes) what value we give $d_{u,c}$. So we can rewrite the outer (minimization) problem as follows:

$$\min_{\lambda, \mu, \nu \ge 0}LR(\lambda,\mu,\nu)=\\\sum_{u}\sum_{c}\left(\rho_{u,c}\right)^{+}+C_{\max}\sum\lambda_{u}-\sum_{u}\mu_{u}+U_{\max}\sum_{c}\nu_{c}.$$

$LR(\lambda,\mu,\nu)$ is a piecewise-linear function of its arguments, with directional gradients, but is not continuously differentiable. (Things get a bit tricky when you are on a boundary between linear segments, which corresponds to having $\rho_{u,c}=0$ for one or more combinations of $u$ and $c$.)

I coded a sample instance in R and tested both solving the LP relaxation (using CPLEX) and solving the Lagrangean problem, both using a derivative-based method (a version of the BFGS algorithm) and using a couple of derivative-free algorithms (versions of the Nelder-Mead and Hooke-Jeeves [1] algorithms). Importantly, all three algorithms are modified to allow box constraints, so that we can enforce the sign restriction on the multipliers.

You can download my code in the form of an R notebook, containing text, output and the code itself (which can be extracted). In addition to CPLEX, it uses a gaggle of R libraries: magrittr (for convenience); ompr, ompr.roi, ROI and ROI.plugin.cplex for building the LP model and interfacing with CPLEX; and dfoptim for the Nelder-Mead and Hooke-Jeeves algorithms. (The BFGS algorithm comes via the optim() method, part of the built-in stats library.) If you want to play with the code but do not have CPLEX or some of the libraries, you can just delete the lines that load the missing libraries along with the code that uses them.

Based on limited experimentation, I would say that Nelder-Mead did not work well enough to consider, and BFGS did well in some cases but produced somewhat suboptimal results in others. It may be that tweaking some control setting would have helped with the cases where BFGS ran into trouble. Hooke-Jeeves, again in limited testing, consistently matched the LP solution. So if I needed to come up with some hand-coded way to solve the problem without using libraries (and did not want to write my own simplex code), I would seriously consider using Hooke-Jeeves (which I believe is pretty easy to code) on the Lagrangean problem.

[1] Hooke, Robert and Jeeves, T. (1961) "Direct search'' solution of numerical and statistical problems. *Journal of the ACM*, Vol. 8, No. 2, 212-229.

Can you elaborate on how to use BFGS for piecewise linear functions? I always got the impression that BFGS requires a sufficiently smooth objective function. With piecewise linear I don't see how a termination rule that relies on "norm of the gradient sufficiently small" can work reliably on a piecewise linear function where even at the optimal solution we might get a subgradient != 0.

ReplyDeleteOr are you using BFGS in a heuristic fashion?

Of course one could use BFGS als

It seems your comment got cut off, unfortunately. Anyway, I was just exploring, seeing which methods seemed to work (or not). With BFGS and the optim() function, I chose the option to approximate gradients via finite difference. If you think about using gradient descent with $f(x)=|x|$ in one dimension, finite difference estimates of the gradient will be +/-1 when you are away from $x=0$, but near $x=0$ the slope estimates (assuming they are computed over a small interval centered at the current point) will be between -1 and +1, and nearly zero as you get close to $x=0$.

DeleteThanks, I know want finite differences are and how to get an exact subgradient for LR() if necessary ;)

DeleteI rephrase my question: Are there any proofs that bfgs converges to an optimal solution on piecewise linear functions/lagrangian relaxations? If so my life would become easier...

Good and free bfgs implementations are much easier to find than libraries for convex non-smmooth optimization (afaik Antonio Frangioni's ndosolver is more or less the only one). Of course you can always write a quick&dirty subgradient method that may or may not work well enough...

Sorry, if there is a proof that BGFS converges on (convex) pw-linear functions, I don't know of it. That does not mean much, as I have not messed with nonlinear programming in the last 30 years or so. If you have not already done so, you might ask on OR Stack Exchange or Mathematics Stack Exchange.

Delete