How to handle a piecewise-linear function of one variable in an optimization
model depends to a large extent on whether it represents an economy
of scale, a diseconomy of scale, or neither. The concepts apply generally,
but in this post I'll stick to cost functions (where smaller is better,
unless you are the federal government) for examples; application to
things like productivity measures (where more is better) requires
some obvious and straightforward adjustments. Also, I'll stick to
continuous functions today.
Let's start with a diseconomy of scale. On the cost side, this is something that grows progressively more expensive at the margin as the argument increases. Wage-based labor is a good example (where, say, time-and-a-half overtime kicks in beyond some point, and double-time beyond some other point). A general representation for such a function is \[f(x)=a_{i}+b_{i}x\mbox{ for }x\in[d_{i-1},d_{i}],\: i\in\{1,\ldots,N\}\] where $d_{0}=-\infty$ and $d_{N}=+\infty$ are allowed and where the following conditions must be met: \begin{eqnarray*}d_{i-1} & < & d_{i}\:\forall i\in\{1,\ldots,N\}\\b_{i-1} & < & b_{i}\:\forall i\in\{2,\ldots,N\}\\a_{i-1}+b_{i-1}d_{i-1} & = & a_{i}+b_{i}d_{i-1}\:\forall i\in\{2,\ldots,N\}.\end{eqnarray*}The first condition just enforces consistent indexing of the intervals, the second condition enforces the diseconomy of scale, and the last condition enforces continuity. The diseconomy of scale ensures that $f$ is a convex function, as illustrated in the following graph.
Letting $f_{i}(x)=a_{i}+b_{i}x$, we can see from the graph that $f$ is just the upper envelope of the set of functions $\{f_{1},\ldots,f_{N}\}$ (easily verified algebraically). To incorporate $f(x)$ in a mathematical program, we introduce a new variable $z$ to act as a surrogate for $f(x)$, and add the constraints\[z\ge a_{i}+b_{i}x\;\forall i\in\{1,\ldots,N\}\] to the model. Anywhere we would use $f(x)$, we use $z$ instead. Typically $f(x)$ is either directly minimized in the objective or indirectly minimized (due to its influence on other objective terms),
and so the optimal solution will contain the minimum possible value of $z$ consistent with the added constraints. In other words, one of the constraints defining $z$ will be binding. On rare occasions this will not hold (for instance, cost is relevant only in satisfying a budget, and the budget limit is quite liberal). In such circumstances, the optimal solution may have $z>f(x)$, but that will not cause an incorrect choice of $x$, and the value of $z$ can be manually adjusted
to $f(x)$ after the fact.
Now let's consider an economy of scale, such as might occur when $x$ is a quantity of materials purchases, $f(x)$ is the total purchase cost, and the vendor offers discounts for larger purchases. The marginal rate of change of $f$ is now nonincreasing as a function of $x$, and as a result $f$ is concave, as in the next graph.
Now $f$ is the lower envelope of $\{f_{1},\ldots,f_{N}\}$. We define the segments as before, save with the middle condition reversed to $b_{i-1}>b_{i}\:\forall i\in\{2,\dots,N\}$. We cannot simply add constraints of the form \[z\le a_{i}+b_{i}x\] if we are (directly or indirectly) minimizing $z$, since $z=-\infty$ would be optimal (or, if we asserted a finite lower bound for $z$, the lower bound would automatically be optimal). In the presence of an economy of scale, we are forced to add binary variables to the formulation, either directly or indirectly. The easiest way to do this is to use a type 2 specially ordered set (SOS2) constraint, provided that the solver (and any modeling language/system) understand SOS2, and provided that $d_{0}$ and $d_{N}$ are finite. Introduce new (continuous) variables $\lambda_{0},\ldots,\lambda_{N}\ge0$ along with $z$, and introduce the following constraints:\begin{eqnarray*}\lambda_{0}+\cdots+\lambda_{N} & = & 1 \\ \lambda_{0}d_{0}+\cdots+\lambda_{N}d_{N} & = & x\\ \lambda_{0}f(d_{0})+\cdots+\lambda_{N}f(d_{N}) & = & z.\end{eqnarray*} Also communicate to the solver that $\{\lambda_{0},\ldots,\lambda_{N}\}$ is an SOS2 set. The SOS2 constraint says that at most two of the $\lambda$ variables can be nonzero, and that if two are nonzero they must be consecutive. Suppose that $\lambda_{i}$ and $\lambda_{i+1}$ are positive and all others are zero. Then the new constraints express $x$ as a convex combination of $d_{i}$ and $d_{i+1}$ (with weights $\lambda_{i}$ and $\lambda_{i+1}=1-\lambda_{i}$), and express $z=f(x)$ as a convex combination of $f(d_{i})$ and $f(d_{i+1}$) using the
same weights.
If SOS2 constraints are not an option, then binary variables have to be added explicitly. Let $\delta_{i}\in\{0,1\}$ designate whether $x$ belongs to the interval $[d_{i-1},d_{i}]$ ($\delta_{i}=1$) or not ($\delta_{i}=0$). Add the following constraints:\begin{eqnarray*}\delta_{1}+\cdots+\delta_{N} & = & 1\\ x & \le & d_{i}+M(1-\delta_{i})\:\forall i\in\{1,\ldots,N\}\\ x & \ge & d_{i-1}-M(1-\delta_{i})\:\forall i\in\{1,\ldots,N\}\\ z & \ge & a_{i}+b_{i}x-M(1-\delta_{i})\:\forall i\in\{1,\ldots,N\}\end{eqnarray*} where $M$ is a suitably large constant. (This formulation will again require that $d_{0}$ and $d_{N}$ be finite.) The first constraint says that $x$ must belong to exactly one interval. If $\delta_{j}=1$, the second and third constraints force $d_{j-1}\le x\le d_{j}$ and are vacuous for all $i\neq j$ (assuming $M$ is correctly chosen). The final constraint forces $z\ge f_{j}(x)$ and is again vacuous for all $i\neq j$. Since $z$ represents something to be minimized, as before we expect $z=f_{j}(x)$ in almost all cases, and can manually adjust it in the few instances were the optimal solution contains padding.


This comment has been removed by the author.
ReplyDeleteThank you for all these details. Do you recommend any free software to handle SOS2?. Regards
ReplyDeleteI believe that CBC (which you'll find at http://www.coin-or.org/projects/) has SOS2 support. I'm not sure about the other COIN MIP solvers listed there; some of them may as well. If you are a college/university professor, you can get free licenses to CPLEX and Gurobi. CPLEX supports SOS2, and I'm pretty sure Gurobi does as well.
ReplyDeleteDear Paul,
ReplyDeleteIt's a excellent post. I have academic version of cplex. As well as I have Ampl. I am requesting you to give me the foloowing information. How can I declare a SOS2 type variable using AMPL? I am new in this area.
Regards,
Dewan
If the reason to use SOS2 constraints is to express piecewise-linear constraints, AMPL has its own notation for such constraints (involving lists of breakpoints and slopes), and my understanding is that SOS2-savvy solvers such as CPLEX will automatically use SOS2 to implement those constraints. The notation is covered in the AMPL book (chapter 17 maybe -- I don't have a current edition).
ReplyDeleteIf for some reason you want or need to express SOS2 constraints directly, there's an example late in this thread: https://groups.google.com/forum/?fromgroups#!topic/ampl/Mo8F0dvHWms. You have to define two suffixes, sosno and ref, and apply them to the variables in the special ordered sets. The sosno suffix identifies the SOS to which the variable belongs, and if I recall correctly (a big "if" in this case) the sign of the sosno determines the type of SOS (positive -> SOS1, negative -> SOS2). For SOS2, the ref suffix indicates the sequencing of the variables. For SOS1, I believe the ref suffix assigns weights (but, again, my memory is a bit shaky).
Dear Paul,
ReplyDeleteThank you very much for your quick reply. The info you described above is surely helpful for me to work with SOS2 constraints.
Regards,
Dewan
Dear Paul,
ReplyDeleteGood afternoon.
I have used the concept of "economy of scale" in my project. And its giving optimal results for my project. Before going to defend my project, I am searching a paper or document to proof the optimality of "economy of scale" concept that you discussed above. I will be very much grateful to you, if you provide me the link some papers.
Best regards,
Dewan
@Dewan,
ReplyDeleteI'm not sure exactly what you want, but I think you might find it in the book "Model Building in Mathematically Programming" by H. Paul Williams. I don't know what the current edition is, but in the fourth edition you might want to look in chapter 7. (The index helpfully includes an entry for "economies of scale".)
Good morning Paul,
ReplyDeleteThank you very much for the information.
I requested you about "proof of global optimality and convergence" of a concave objective function like "economy of scale" function. Currently I am working to piece-wise linear approximate a concave objective function of one variable that contains an abs(sin(X)), i.e., a quadratic concave cost function of one variable along with absolute valve of SIN(X). The "economy of scale" concept that you described above is working fine for my piece-wise linear approximating of a concave objective function with ABS(SIN(X)) after applying the concept of "economy of scale". And I am getting global optimal solution but I don't how I can proof that it gives a global optimal solution. If I defend my model without knowing that reviewers can ask me what the proof of global optimality and convergence of the concave cost function.
I found "proof of global optimality and convergence" of non-convex cost function but I could not find any proof for concave cost function.
I will read the chapter 7 of the book. Thanks for the info.
Best regards,
Dewan
@Dewan: I think there are two issues here: whether the solution to the MILP model is actually a global optimum of the pw-linear approximation; and whether the global optimum of the pw-linear approximation is adequately good as an approximation to the optimum of the original problem. I don't know off-hand a published proof of the first, but that's because it is a rather trivial proof. Suppose that the MILP optimum is not optimal for the pw-linear approximation. Then there exists a solution feasible x* in the pw-linear problem with a strictly superior objective value. Simply select the correct values for the binary variables (i.e., the ones that select the segment containing x*), demonstrate that with those values for the binaries and x = x* you have a feasible solution to the MILP, and assert a contradiction to the optimality of the MILP solution.
ReplyDeleteAs far as the second point goes, if the issue is how close the MILP objective is to optimal, you just need to compute the worst difference between the MILP objective and your original objective. If the issue is how close the MILP solution x* is to the true global solution (say, in Euclidean norm), that's a bit trickier. You could conceivably be on the opposite side of the feasible region from the true optimal x and be quite close in objective value.