Someone posted a nonconvex nonlinear optimization model on OR Stack Exchange and asked for advice about possible reformulations, piecewise linear approximations, using global optimizers, and other things. The model is as follows:maxq1+q2s.t.n∑i=1pixi=T∑t=0Ft(1+q1)t(1)n∑i=1pixi=T∑t=0n∑i=1bt,ixi(1+q2)t(2)n∑i=1pixi=βT∑t=0Ft(1+q1+q2)t(3)q1,q2≥0x∈[0,1]n All symbols other than q1, q2 and x are model parameters (or indexes). The author originally had x as binary variables, apparently believing that would facilitate linearization of products, but also expressed interest in the case where x is continuous. I'm going to propose a possible "low-tech" solution procedure for the continuous case. The binary case is probably a bit too tough for me. The author supplied sample data for all parameters except β, with dimensions n=3 and T=4 but expressed a desire to solve the model for n=10,000 and T=1,200 (gulp).
Note that the left-hand sides (LHSes) of the three constraints are identical. Let h() be the function on the right-hand side (RHS) of constraint (1), so that the RHS of (1) is h(q1). h() is a monotonically decreasing function. The RHS of (3) is βh(q1+q2). Since the left sides are equal, we have βh(q1+q2)=h(q1)(4)which tells us several things. First, q2≥0⟹h(q1+q2)≤h(q1), so if β<1 it is impossible to satisfy (4). Second, if β=1, (4) implies that q2=0, which simplifies the problem a bit. Lastly, let's assume β>1. For fixed q1 the LHS of (4) is monotonically decreasing in q2, with the LHS greater than the RHS when q2=0 and with limq2→∞βh(q1+q2)=βF0. If βF0>h(q1), there is no q2 that can balance equation (4), and so the value of q1 is infeasible in the model. If βF0<h(q1), then there is exactly one value of q2 for which (4) holds, and we can find it via line search.
Next, suppose that we have a candidate value for q1 and have found the unique corresponding value of q2 by solving (4). We just need to find a vector x∈[0,1]n that satisfies (1) and (2). Equation (3) will automatically hold if (1) does, given (4). We can find x by solving a linear program that minimizes 0 subject to (1), (2) and the bounds for x.
Thus, we have basically turned the problem into a line search on q1. Let's set an arbitrary upper limit of Q for q1 and q2, so that our initial "interval of uncertainty" for q1 is [0,Q]. It's entirely possible that neither 0 nor Q is a feasible value for q1, so our first task is to search upward from 0 until we find a feasible value (call it Qℓ) for q1, then downward from Q until we find a feasible value (call it Qh) for q1. After that, we cross our fingers and hope that all q1∈[Qℓ,Qh] are feasible. I think this is true, although I do not have a proof. (I'm much less confident that it is true if we require x to be binary.) Since q2 is a function of q1 and the objective function does not contain x, we can search [Qℓ,Qh] for a local optimum (for instance, by golden section search) and hope that the objective function is unimodal as a function of q1, so that the local optimum is a global optimum. (Again, I do not have proof, although I would not be surprised if it were true.)
I put this to the test with an R script, using the author's example data. The linear programs were solved using CPLEX, with the model expressed via the OMPR package for R and using ROI as the intermediary between OMPR and CPLEX. I first concocted an arbitrary feasible solution and used it to compute β, so that I would be sure that the problem was feasible with my choice of β. Using β=1.01866 and 100 (arbitrarily chosen) as the initial upper bounds for q1 and q2, my code produced an "optimal" solution of q1=5.450373, q2=0.4664311, x=(1,0.1334608,0) with objective value 5.916804. There is a bit of rounding error involved: the common LHS of (1)-(3) evaluated to 126.6189, while the three RHSes were 126.6186, 126.6188, and 126.6186. (In my graduate student days, our characterization of this would be "good enough for government work".) Excluding loading libraries, the entire process took under three seconds on my desktop computer.
You can access my R code from my web site. It is in the form of an R notebook (with the code embedded), so even if you are not fluent in R, you can at least read the "prose" portions and see some of the nagging details involved.
Small typo in (2). Pretty sure that the index under the 3rd summation should be i, not n.
ReplyDeleteThanks for pointing that out. It's fixed now.
DeleteI happens to be the one who asked the question on OR StackExchange. I didn't realise that you have written a blog here. I think you are being modest when you say "low-tech" solution. To me it looks like a very elegant solution. Using your assumption for β, I get the same answer with Couenne solver. I see that you have given your R code as well but I don’t understand R. I can follow the code up to the point you solve the LP to find x, and then you find a lower bound for q1. If I may ask some questions on the method:
ReplyDelete(1) What bracketing interval you assume to find feasible values for q1 using line search when setting the (feasible) lower bound and upper bound?
(2) What is the objective function for the golden search? Is it given by LHS – RHS of (1) that you minimize?
(3) Are you iteratively solving the LP based on candidate solutions for q1 and q2 until you hit the minimum based on Golden Search?
Hi,
Delete(1) I start out by setting upper limits of 100 for q1 and q2, snatched out of thin air. In practice, the context of the model (what q1 and q2 stand for) would hopefully let you set reasonable upper bounds. To get an initial lower bound for q1, I find the maximum value of the left side of (1). In the R code, I assume the pi are nonnegative, in which case the left side is maxed out when all xi=1. I then solve (1) to get a value for q1, which is the smallest possible choice for q1. Neither the lower nor upper bound is automatically feasible, so I do a line search up from the lower bound and a separate line search down from the upper bound to get the lowest and highest values of q1 than are feasible.
(2) The objective function for golden section is q1+q2, the original objective function.
(3) At each step of the golden section search, we get a new candidate value for q1. I call the findQ2() function to find the corresponding value of q2. That function in turn calls findX(), which solves an LP to get x. So there is an LP solution at each step of the GS search.
I hope that clarifies things.
Thank you for the detailed explanation. Overall method is clear to me. I would like to clarify one point regarding the interval of uncertainty.
ReplyDeleteIn the R code, I note that you assume a step size of 0.1 and keep increasing the value of q1 by this amount until you find a feasible value. That code calls findQ2() function which in turn calls findX(). To find the upper bound, you use bisection method with a bracket of [q1min,100] to find a root for q1.
I do not understand why different methods are used for lower and upper bounds. Can the bisection method be used to find the lower bound as well for q1?
You can use bisection (which may be faster than a fixed step approach) to find the lower bound *if* you have a feasible value you can use as the upper limit of the search interval. You cannot use bisection is you are starting with an interval where both endpoints are infeasible (because, after checking the midpoint, you would have no idea which half interval to keep). So you need a feasible value of q1 before you do anything else. I suppose you could randomly generate values of q1 in the initial interval until you got a feasible one, then use bisection (twice) to get feasible lower and upper limits.
Delete