Tuesday, May 5, 2026

Identifying Extreme Points

I found a recent blog post by Erwin Kalvelagen titled "Convex hull models" rather interesting. Although the title mentions convex hulls, what Erwin actually discusses is finding the extreme points of the convex hull of a finite point set. (The full convex hull would also include bounding hyperplanes, which is a separate, arguably thornier, matter.) Erwin presents both a mixed integer nonlinear programming (MINLP) model and two mixed integer linear programming (MILP) models. In computational tests, they all solved at the root node, leading Erwin to make the following observation: "[a]s we consistently get integer models that need 0 nodes it is likely this model is really an LP." He tested the LP relaxation of one of his models, and it seemed to work.

My interest stems in part from the fact that, after contemplating the problem a bit, I assumed it could be solved as an LP. Suppose we have a finite set $P=\lbrace p_1,\dots,p_m\rbrace\subset \mathbb{R}^n$ of points. Our mission is to determine which of them are extreme points of $\mathrm{conv}(P),$ the convex hull of $P.$ A starting point is the recognition that the extreme points of $\mathrm{conv}(P)$ must belong to $P,$ and that an extreme point cannot be a convex combination of other points in $P.$

Now consider this system of linear equations and inequalities: 

$$p_i = \sum^{m}_{k=1}\lambda_{k}p_k \quad (1)$$

$$\sum^{m}_{k=1}\lambda_{k} = 1 \quad (2)$$

$$\lambda_{k} \ge 0 \quad \forall k. \quad (3)$$

This expresses point $p_i$ as a convex combination of the points in $P,$ including itself. One trivial solution is $\lambda_i = 1$ and $\lambda_j = 0$ for all $j\neq i.$ If a solution exists with $\lambda_i=0,$ then $p_i$ is a convex combination of the other points and therefore not extreme. If no solution with $\lambda_i=0$ exists, then $p_i$ cannot be written as a convex combination of the other points and therefore must be extreme. This logic, by the way, is also embedded in Erwin's models.

Suppose that we find a solution with $0 < \lambda_i < 1.$ Combining the $p_i$ terms on the left side of (1), we have $$(1-\lambda_i) p_i = \sum_{k \neq i} \lambda_k p_k$$ and therefore $$p_i = \sum_{k \neq i} \frac{\lambda_k}{1-\lambda_i} p_k.$$ Since $$\sum_{k=1}^m \lambda_k = 1 \implies \sum_{k\neq i}  \frac{\lambda_k}{1-\lambda_i} = 1,$$ we know that if a solution to (1)-(3) exists with $\lambda_i < 1$ then a solution exists with $\lambda_i = 0.$

So we can solve the problem of minimizing $\lambda_i$ subject to the constraints above. If the minimum value is 0, $p_i$ is not an extreme point. If the minimum value is strictly greater than 0, then it must be 1, which means that $p_i$ cannot be written as a convex combination of the other points and therefore must be extreme.
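To make the single-point LP concrete, here is a minimal sketch in Python using `scipy.optimize.linprog` (not the R/lpSolve code from my experiment; the helper name `is_extreme` and the sample points are my own inventions for illustration):

```python
import numpy as np
from scipy.optimize import linprog

def is_extreme(P, i, tol=1e-9):
    """Return True if P[i] is an extreme point of conv(P).

    Solves: minimize lambda_i subject to
        sum_k lambda_k * P[k] = P[i]   -- constraint (1)
        sum_k lambda_k       = 1       -- constraint (2)
        lambda_k >= 0                  -- constraint (3)
    """
    m, n = P.shape
    c = np.zeros(m)
    c[i] = 1.0                           # objective: minimize lambda_i
    A_eq = np.vstack([P.T, np.ones(m)])  # (1) stacked by coordinate, plus (2)
    b_eq = np.append(P[i], 1.0)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    # The minimum is either 0 (not extreme) or 1 (extreme).
    return bool(res.x[i] > tol)

# The corners of the unit square plus its center: the corners are
# extreme, the center is not.
P = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
print([is_extreme(P, i) for i in range(len(P))])
# → [True, True, True, True, False]
```

Since the minimum of $\lambda_i$ is always 0 or 1, the tolerance only has to guard against solver round-off.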

Along the lines of what Erwin did, we can write a single LP that identifies all extreme points. We add a second index to $\lambda,$ so that $\lambda_{i,k}$ is the weight assigned to $p_k$ when writing $p_i$ as a convex combination of the points. We then stack the constraints for each $p_i$ into a single model and minimize $\sum_i \lambda_{i,i}.$ Positive values of the objective terms identify extreme points. That said, I find it perhaps more convenient to solve a sequence of smaller LPs, one for each point in $P.$
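The stacked formulation can be sketched as follows, again in Python with `scipy.optimize.linprog` rather than R/lpSolve (the function name `extreme_points_single_lp` is my own). The equality constraint matrix is block diagonal, with one copy of constraints (1)-(2) for each point:

```python
import numpy as np
from scipy.linalg import block_diag
from scipy.optimize import linprog

def extreme_points_single_lp(P, tol=1e-9):
    """Identify all extreme points of conv(P) with a single LP.

    Variables lambda[i, k] are flattened row-major, so lambda[i, k] sits at
    position i*m + k: the weight on P[k] when representing P[i]. We minimize
    sum_i lambda[i, i]; P[i] is extreme iff its diagonal term is positive.
    """
    m, n = P.shape
    # One block of (n + 1) equality rows per point i: constraints (1)-(2).
    block = np.vstack([P.T, np.ones(m)])          # shape (n + 1, m)
    A_eq = block_diag(*([block] * m))             # shape (m*(n+1), m*m)
    b_eq = np.concatenate([np.append(P[i], 1.0) for i in range(m)])
    c = np.zeros(m * m)
    c[np.arange(m) * m + np.arange(m)] = 1.0      # pick out lambda[i, i]
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    lam = res.x.reshape(m, m)
    return np.diag(lam) > tol

# Corners of a square plus its center: corners extreme, center not.
P = np.array([[0, 0], [2, 0], [0, 2], [2, 2], [1, 1]], dtype=float)
print(extreme_points_single_lp(P))
```

Note the size: the combined model has $m^2$ variables and $m(n+1)$ equality constraints, versus $m$ variables and $n+1$ constraints per small LP.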

As a proof of concept, I wrote an R Markdown document that generates a random set of points in two dimensions and uses the single-point model to find the extreme points. I stuck to two dimensions to allow for easy plotting of the results. It uses the lpSolve package, a wrapper for the open-source lp_solve LP/MIP solver. (It also needs the ggplot2 package for plotting purposes.) You can download the .Rmd file and try it for yourself (assuming that you have R and the two aforementioned packages installed). On my fairly humble PC, two-dimensional test cases with 1,000 points solved in the blink of an eye. Of course, solution times increase as dimension and sample size increase. It is an empirical question (for which I have no answer) whether solving for all points at once in a single model is faster or slower than solving for each point individually. To belabor the obvious, doing it all in a single model will require much more memory.