Monday, October 29, 2018

Pseudocode in LyX

Fair warning: This post is for LyX users only. When I'm writing a paper or presentation in LaTeX (using LyX, of course) and want to include a program chunk or algorithm in pseudocode, I favor the algorithmicx package (and specifically the algpseudocode style). There being no intrinsic support for the package in LyX, I have to enter the various commands as raw LaTeX (or, in LyX parlance, "ERT" -- "Evil Red Text", so named for historical reasons). Unfortunately, I do this seldom enough that I do not remember said commands, so each sojourn into pseudocode involves finding and opening the documentation file for algorithmicx.

I finally decided to fix that by writing a pseudocode module for LyX. The beta version (which seems pretty stable to me) is available under a GPLv2 license from my Github repository. Besides a "gitignore" file (which you should, as the name suggests, ignore), there is the module file and a combination user guide/demonstration LyX document. Installation instructions are included in the latter. You will, of course, need to install the algorithmicx package before using the module.

Should you decide to try it, there's an issue tracker in the repository where you can report bugs or feature requests. I hope someone else can get some use out of this, but even if not, the development time will pay itself back the next time I need to write some pseudocode.

Sunday, October 28, 2018

B.S.-ing Precisely

In a recent blog post titled "Excessive Precision", John D. Cook points out the foolishness of articulating results to an arbitrarily high degree of precision when the inputs are themselves not that precise. To quote him:
Excessive precision is not the mark of the expert. Nor is it the mark of the layman. It’s the mark of the intern.
He also mentions that it is typically difficult to assess the precision of the output even if we know the precision of the input. This is in part a matter of possible nonlinearity (and quite possibly opaqueness, as in "black box") in the mechanism that transforms inputs into outputs. It can also be caused by the inherent imprecision of floating point arithmetic (rounding error, truncation error, spiteful quantum effects, ...).

In operations research, there are myriad other sources of imprecision. Your conceptual model of the system is an approximation of the actual system. You may have linearized things that are not linear, or used "convenient" nonlinear representations (polynomials, exponential functions) for things that are nonlinear in a goofy way. If you are like me, you will have ignored randomness entirely, because the work is hard enough even when you pretend determinism. (Also, I confuse easily.) If you did bake in some explicit consideration of randomness, you likely approximated that. (How many things in life are really normally distributed?) There's also the matter of the passage of time. Did you make all the coefficients, distributions etc. time-varying? Didn't think so (and don't really blame you). Throw in estimation error for the model parameters and you have a witches' brew of imprecision. (It's almost Halloween, so I had to work in at least one spooky metaphor.)

This brings to mind two running questions that I encounter with discrete optimization models. I doubt either has a definitive answer. The first question is whether there is any benefit to running the solver all the way to "proven optimality" (gap nearly zero) if everything is so approximate. My personal preference is to do it when it doesn't take too long (why not?) but not bother if the gap is closing at a painfully slow rate.

The second question is whether to bother using a MIP model and solver at all, or just run some metaheuristic (preferably one with a cool biologically-inspired name, such as "banana slug foraging optimization"). After all, if you are just getting an answer to an approximation of the actual problem, why not get an approximate answer to the approximation? My personal inclination is to solve the model exactly when possible, in the hope that the model is at least "close" to reality and thus an optimal solution to the model will be "close" to an optimal solution to the real problem. At the same time, I recognize that there is a lot of hoping going on there, so if getting an exact solution is painful, I'll switch to a heuristic or metaheuristic and hope that the answer I get proves decent in practice.

Either way, I'm not going to claim the "optimal" value of some variable is 2.31785 with any degree of certainty. It's about 2.3 (if we're lucky). On the bright side, a lot of the problems I deal with have binary variables. There's not a lot of concern there about artificial precision; the only concern is artificial confidence in the solution.