## Thursday, August 7, 2014

### Fewer Zeros

A question I saw online not too long ago caused me a flashback to my days teaching linear programming (LP) to masters students. The poster had developed an optimization model -- I can't recall if it was an LP, a quadratic program (QP) or a mixed-integer program (MIP) -- and had no problem solving it. He was, however, unhappy with the number of decision variables taking value zero in the optimal solution, and was quite insistent on being shown how to get another (optimal?) solution containing fewer zeros.

My first reaction took me back even further, to my high school years. If the optimal solution has a bunch of zeros, and it's the unique optimal solution, then that's that. My second recollection was my lecturing on post-optimal sensitivity analysis, and the wonders of multiple optima (which, in the case of LP, you can detect by examining the reduced costs of the variables that did not make it into the final simplex basis). Most people, understandably, are happy to yank the final solution off the printer (or dump it to a file, or whatever) and run with it. I gave two rationales for poking it a bit to see if there were other optima, or perhaps near-optima. Both rationales were illustrated by product mix models (deciding what products to manufacture, in what quantities, given resource constraints, unit profit margins, etc.).
1. Not all "optimal" solutions are created equal. Unless you are dabbling in multiobjective optimization, the model optimizes a single criterion function. There may be other criteria that can be used to distinguish between multiple optimal (or near-optimal) solutions. For instance, does the solution you found require a massive revamping of the production schedule, while another optimal solution lurks that is a minor tweak of the current schedule? Does the optimal solution found end production of the product your boss developed or championed back when, the one that got her promoted to her current position, while another optimal solution continues to produce it?
2. Does your optimal solution kill off some products and, if so, are there other optimal/near-optimal solutions that continue to make them? Zeroing out production of PCs in favor of laptops and tablets may make sense given the current market (as reflected in the current values of demand and revenue parameters in your model), but the marketing weenies may tell you that, should PCs suddenly become hot, you will have a harder time climbing back into the market if you have not retained at least some presence. You may also lose some design or manufacturing competence for a particular product type if you stop making it entirely.
The second rationale is consistent with the original poster's desire for fewer zeros. I'm pretty sure that, given enough time (and enough beers), I could come up with similar rationales in contexts unrelated to production scheduling.

So how does one reduce the number of zeros in the solution? I'm assuming, for simplicity, that you would ideally like none of the variables to be zero; it's easy to tweak what follows to reflect a desire to make some variables positive combined with an indifference to the fate of other variables.

The first question you need to ask yourself is what you are willing to give up in trade, because it's entirely possible you are staring at the unique optimal solution (or that other optima don't do much better as far as number of zeros is concerned). Let's say that your original problem was to maximize $f(x)$ subject to $x\in X$, where $X$ is the original feasible region (defined by various constraints that I don't need to write done here). Let's say further that $x^*\in\Re^n$ is the optimal solution and $f^*=f(x^*)$ is its objective value. You decide that you are willing to sacrifice up to $\epsilon \ge 0$ from the objective value to get a solution with fewer zeros, or maybe up to $\epsilon$ for each variable whose value switches from zero to positive, or something along those lines. (If $\epsilon = 0$, you'd better hope there is an alternate optimum.)

The second question you need to ask yourself is, for each variable $x_i$, what is the minimum value ($L_i > 0$) that you would be willing to accept. Producing 50 truckloads of corn flakes per week combined with one box of wheat flakes is not a whole lot different from just making corn flakes.

The methods discussed below do not exhaust all possibilities; they're just what come to mind at the moment.

#### Method 1: Utopia

We can add the constraint $f(x)\ge f^*-\epsilon$ and pick a new objective function designed to push $x$ toward having fewer zeros. One possibility is to aim for a "utopia point", an idealized solution you likely will never reach. So consider the following problem (which remains an LP if $f()$ is linear, and is quadratically constrained but convex if $f()$ is concave -- anything else and you're on your own).
$\begin{array}{lrcl} \textrm{minimize} & \sum_{i=1}^{n} & w_{i}(y_{i}+z_{i})\\ \textrm{s. t.} & x & \in & X\subset\Re^{n}\\ & f(x) & \ge & f^{*}-\epsilon\\ & x_i - y_i + z_i & = & u_i \ (i\in 1,\dots,n) \\ & y,z & \ge & 0 \\ \end{array}$ The utopia point $u\in \Re^n$ is a vector of ideal values (presumably positive) for the $x$ variables; the weights $w_i > 0$ reflect your priorities for getting close the utopian value $u_i$ for each $x_i$. (Auxiliary variables $y$ and $z$ are used to get the absolute difference between $x$ and $u$.) The objective minimizes $\left\Vert x - u \right\Vert_1$, the $L_1$ norm of the difference; you can use a different norm ($L_0$ stays linear, $L_2$ results in a quadratic program) if you wish.

#### Method 2: Bounds

A second, perhaps simpler, approach is to assert a strictly positive lower bound $L_i > 0$ on every variable $x_i$.
$\begin{array}{lrcl} \textrm{maximize} & f(x)\\ \textrm{s.t.} & x & \in & X\subset\Re^{n}\\ & x & \ge & L\\ \end{array}$There is a danger that this could make the problem infeasible (if you are too aggressive in the choice of $L$). Assessing the trade-off between objective value ($f$) and bounds ($L$) is largely a matter of trial and error. Dual variable values (LP) or KKT multipliers (QP) may allow you to incrementally adjust the bounds until a satisfactory outcome is achieved.

#### Method 3: Binary Variables

Yet another possibility is to introduce binary variables reflecting which original variables are "turned on", and then optimize the selection of those variables. Let $z_i\in\{0,1\}$ take the value 1 if $x_i$ is "sufficiently" positive and 0 otherwise. Suppose that we also have weights $w_i > 0$ reflecting the perceived importance of "turning on" each variable $x_i$. (If you just want as many nonzeros as possible, set $w_i = 1$ for all $i$.) Consider the following model:$\begin{array}{lrcl} \textrm{maximize} & \sum_{i=1}^{n} & w_{i}z_{i}\\ \textrm{s. t.} & x & \in & X\subset\Re^{n}\\ & f(x) & \ge & f^{*}-\epsilon\\ & x_i & \ge & L_i z_i \ (i\in 1,\dots,n) \\ & z & \in & \{0,1\}^n \\ \end{array}$We maximize the aggregate value of the choice of $x$ variables taking (sufficiently) positive value, while maintaining feasibility and taking an acceptable hit in the value of the original objective function. The computational cost is that we have converted our original problem, which might have been a relatively easy to solve LP or QP, into either a mixed-integer linear program or a mixed-integer quadratically constrained program, either of which will be harder to solve.

#### Method 4: Binary Variables Again

A variation of the previous approach is to maximize $f$ subject to a restriction on the number of zeros in the solution:$\begin{array}{lrcl} \textrm{maximize} & f(x)\\ \textrm{s. t.} & x & \in & X\subset\Re^{n}\\ & x_i & \ge & L_i z_i \ (i\in 1,\dots,n) \\ & \sum_{i=1}^{n} w_{i}z_{i} & \ge & K \\ & z & \in & \{0,1\}^n \\ \end{array}$where the weights $w_i$ again indicate relative importance of getting strictly positive values (all 1 if you just want to achieve a certain number of nonzeros) and $K$ is the minimum value/minimum count of nonzeros that you are willing to accept. Once again, assessing the trade-off between objective value and nonzeros is a trial and error process.